Abstract

Motivated by applications in databases, this paper considers various 
fragments of the calculus of binary relations. The fragments are obtained 
by leaving out, or keeping in, some of the standard operators, along with 
some derived operators such as set difference, projection, coprojection, 
and residuation. For each considered fragment, a characterization is ob- 
tained for when two given binary relational structures are indistinguish- 
able by expressions in that fragment. The characterizations are based on 
appropriately adapted notions of simulation and bisimulation. 

1 Introduction 

The calculus of relations Q21 HH1 023 consists of five natural operations on 
binary relations: union, intersection, complementation, composition, and con- 
verse. These operators can be applied to given binary relations, combined with 
the four standard constant relations: empty, full, identity, and diversity. The 
calculus of relations is a very natural formalism and occurs within logics for rea- 
soning about binary relations, notably dynamic and description logics [T51 15]. 
The calculus also has motivated the development of the theory of relation al- 
gebras [^nilTTj. In the present paper, however, we are not looking at abstract 
relation algebras, but rather at the question of indistinguishability of two given 
finite binary relational structures within the calculus of relations. 

Indistinguishability of structures in various logics is one of the most basic 
concepts studied in finite model theory and in the study of the expressive power 
of database query languages [El HE1 IS]. Indeed, the calculus of relations, as
a core relational algebra query language on binary relations, is very relevant 
to the field of databases. Binary relations, or, equivalently, directed graphs, 
show up naturally in data on the Web [TU], dataspaces [TT], Linked Data 
[3[TS], and RDF data pQ. Moreover, in restriction to directed graphs that are 
trees, the relational calculus is closely tied to the XML query language XPath, 
and the expressive power of XPath and various fragments has been intensively 
investigated [6l l22 l l23 l [14 ] . 

Here, working with general finite binary relation structures rather than trees, 
we consider, in addition to the five binary relation operations and four constant 
binary relations mentioned above, also four derived operations that are well 
known in the literature: set difference; projection; coprojection; and residua- 
tion. These derived operations can be expressed in terms of the other opera- 
tions and constants, but can still be interesting on their own when considering 
fragments where some other operations or constants have been left out. We con- 
sider set difference because it is the standard domain-independent alternative to 
complementation in database query languages [3]. We consider projection and 
coprojection (existential and universal quantification) because they are stan- 
dard logical operations, and have been shown important in the XPath setting, 
so it is natural to study their behaviour when generalising from trees to general 
graphs. Finally, we consider residuation because it is similar to the standard 
relational division operation in databases, and corresponds to the set
containment join [21]. Obviously, one could keep on inventing additional
operations on binary relations and study their interdependencies, but we hope
our chosen set of operations is well-motivated and not too large.

Our goal now is to understand the relative importance of the various opera- 
tions and the effect of their presence on indistinguishability. Thereto we consider 
all possible fragments of the calculus of relations that can be constructed as fol- 
lows. The most basic fragment we consider has the empty and identity relations 
as constants, and the operations union, composition, and intersection. Then 
all other fragments arise by adding any choice of the remaining operations and 
constants. For each fragment, we provide a characterization of when two finite 
binary relation structures are indistinguishable by expressions in the fragment. 
Our approach is to come up with bisimilarity-like characterizations [28].

To conclude this Introduction, we note another motivation to understand 
indistinguishability in database query language fragments, apart from the in- 
trinsic foundational motivation. This is the new approach of structural indexing 
to database query processing, proposed by some of us and others 01 [22], whereby 
a given query expression is processed by accessing blocks of data indistinguish- 
able by the operations used in the given expression. By the results of our work, 
these blocks can be computed using similarity or bisimilarity checks. 

This paper is organized as follows. In Section 2 we define the language
fragments formally, and define the notion of indistinguishability. In Section 3
we discuss different ways in which indistinguishability can be characterized;
in particular we discuss the connection with multi-dimensional modal logics,
and the 3-variable fragment of first-order logic. In Section 4, we consider the
fragments with the set difference operation. In Section 5, we consider the fragments
without set difference. In Section 6 we show that indistinguishability of finite 
structures is decidable in polynomial time. We conclude in Section 7. 

2 Language fragments and indistinguishability 

We assume an infinite universe of atomic data elements, denoted by $U$. A binary
relation on $U$ is a subset of $U^2 = U \times U$. We further fix an arbitrary
finite set $A$ of relation names, called the vocabulary. In the calculus of
relations, a structure is then a pair $\mathcal{G} = (V, (R^{\mathcal{G}})_{R \in A})$
where $V$ is a subset of $U$ and each $R^{\mathcal{G}}$ is a binary relation on
$V$. The set $V$ is called the set of nodes of $\mathcal{G}$; the vocabulary $A$
can be thought of as a set of edge labels, whereby $\mathcal{G}$ can be thought
of as an edge-labeled directed graph. When $V$ is finite, the structure is said
to be a finite structure.

Expressions in the calculus of relations are built recursively from the relation
names $R$, and the constant symbols empty ($0$), all ($1$), diversity ($0'$), and
identity ($1'$), using the following standard and/or derived operations. The
standard operations are union ($e_1 \cup e_2$), intersection ($e_1 \cap e_2$),
complementation ($e^c$), composition ($e_1 \circ e_2$), and converse ($e^{-1}$);
the derived operations we consider are set difference ($e_1 - e_2$), projection
($\pi_1 e$ or $\pi_2 e$), co-projection ($\bar\pi_1 e$ or $\bar\pi_2 e$), left
residual ($e_1 / e_2$) and right residual ($e_1 \backslash e_2$).^1

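Semantically, on any structure $\mathcal{G}$ as above, an expression $e$ defines
a binary relation, denoted by $e(\mathcal{G})$. For convenience, we recall the
semantics of the constants and the standard operations.

\begin{align*}
R(\mathcal{G}) &= R^{\mathcal{G}}; \\
0(\mathcal{G}) &= \emptyset; \\
1(\mathcal{G}) &= V^2; \\
0'(\mathcal{G}) &= \{(s,t) \mid s,t \in V \ \&\ s \neq t\}; \\
1'(\mathcal{G}) &= \{(s,s) \mid s \in V\}; \\
(e_1 \cup e_2)(\mathcal{G}) &= e_1(\mathcal{G}) \cup e_2(\mathcal{G}); \\
(e_1 \cap e_2)(\mathcal{G}) &= e_1(\mathcal{G}) \cap e_2(\mathcal{G}); \\
e^c(\mathcal{G}) &= \{(s,t) \mid s,t \in V \ \&\ (s,t) \notin e(\mathcal{G})\}; \\
(e_1 \circ e_2)(\mathcal{G}) &= \{(s,t) \mid (\exists v)((s,v) \in e_1(\mathcal{G}) \ \&\ (v,t) \in e_2(\mathcal{G}))\}; \\
e^{-1}(\mathcal{G}) &= \{(s,t) \mid (t,s) \in e(\mathcal{G})\}.
\end{align*}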

1 To distinguish between set difference and the right residual, we use the
minus sign ($-$) for set difference.







Figure 1: Example structure from Example 1.

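The semantics of the derived operations is as follows:

\begin{align*}
(e_1 - e_2)(\mathcal{G}) &= \{(s,t) \mid (s,t) \in e_1(\mathcal{G}) \ \&\ (s,t) \notin e_2(\mathcal{G})\} \\
\pi_1(e)(\mathcal{G}) &= \{(s,s) \mid (\exists t)\,(s,t) \in e(\mathcal{G})\} \\
\pi_2(e)(\mathcal{G}) &= \{(s,s) \mid (\exists t)\,(t,s) \in e(\mathcal{G})\} \\
\bar\pi_1(e)(\mathcal{G}) &= \{(s,s) \mid s \in V \ \&\ \neg(\exists t)\,(s,t) \in e(\mathcal{G})\} \\
\bar\pi_2(e)(\mathcal{G}) &= \{(s,s) \mid s \in V \ \&\ \neg(\exists t)\,(t,s) \in e(\mathcal{G})\} \\
(e_1 / e_2)(\mathcal{G}) &= \{(s,t) \mid (\forall v)((t,v) \in e_2(\mathcal{G}) \rightarrow (s,v) \in e_1(\mathcal{G}))\} \\
(e_1 \backslash e_2)(\mathcal{G}) &= \{(s,t) \mid (\forall v)((v,s) \in e_1(\mathcal{G}) \rightarrow (v,t) \in e_2(\mathcal{G}))\}
\end{align*}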
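The semantics above translate almost directly into code. The following is a minimal sketch, not part of the original development: it assumes expressions are encoded as nested tuples with hypothetical operator tags (`"rel"`, `"comp"`, `"copi1"`, and so on), and it uses a small made-up structure rather than the one in Figure 1.

```python
from itertools import product

def evaluate(expr, nodes, rels):
    """Evaluate an expression (a nested tuple) on a finite structure.

    nodes: collection of nodes V; rels: dict from relation name to a set of pairs.
    The tuple encoding and the tag names are assumptions of this sketch.
    """
    op, *args = expr
    full = set(product(nodes, repeat=2))      # the constant 1
    ident = {(s, s) for s in nodes}           # the constant 1'
    if op == "rel":
        return set(rels.get(args[0], ()))
    if op == "empty":
        return set()
    if op == "all":
        return full
    if op == "id":
        return ident
    if op == "di":
        return full - ident
    sub = [evaluate(a, nodes, rels) for a in args]
    if op == "conv":
        return {(t, s) for (s, t) in sub[0]}
    if op == "compl":
        return full - sub[0]
    if op in ("pi1", "pi2", "copi1", "copi2"):
        idx = 0 if op.endswith("1") else 1    # first or second column
        hit = {p[idx] for p in sub[0]}
        if op.startswith("co"):               # coprojection: nodes NOT hit
            return {(s, s) for s in nodes if s not in hit}
        return {(s, s) for s in hit}
    left, right = sub
    if op == "union":
        return left | right
    if op == "inter":
        return left & right
    if op == "diff":
        return left - right
    if op == "comp":
        return {(s, t) for (s, v) in left for (w, t) in right if v == w}
    if op == "lres":   # (s,t) such that every (t,v) in e2 has (s,v) in e1
        return {(s, t) for (s, t) in full
                if all((s, v) in left for (u, v) in right if u == t)}
    if op == "rres":   # (s,t) such that every (v,s) in e1 has (v,t) in e2
        return {(s, t) for (s, t) in full
                if all((v, t) in right for (v, u) in left if u == s)}
    raise ValueError(f"unknown operator: {op}")

# Hypothetical toy structure (not the structure of Figure 1).
nodes = {"ann", "bob", "drx", "clinic"}
rels = {
    "patientOf": {("ann", "drx")},
    "knows": {("bob", "drx")},
    "worksAt": {("drx", "clinic")},
}
doctors = ("pi2", ("rel", "patientOf"))
known_doctors = evaluate(("comp", ("rel", "knows"), doctors), nodes, rels)
```

On this toy structure, `doctors` evaluates to the single pair for `drx`, and `known_doctors` relates `bob` to `drx`.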

Example 1. Figure 1 shows a finite structure $\mathcal{G}$. The set of nodes equals
{migraine, flu, sue, umi, saori, sriram, st jude's, inco}, and the vocabulary $A$ equals
{knows, worksAt, patientOf, hasDisease}.

• The doctors (i.e., persons having patients) can be retrieved from
$\mathcal{G}$ by the expression

$e_1 = \pi_2(\mathit{patientOf})$

resulting in $e_1(\mathcal{G}) = \{(\mathit{saori}, \mathit{saori})\}$.

• The people and the doctors they know can be obtained by the expression

$e_2 = \mathit{knows} \circ e_1$

resulting in $e_2(\mathcal{G}) = \{(\mathit{kotaro}, \mathit{saori})\}$.

• The doctors and the hospitals where they practice:

$(e_1 \circ \mathit{worksAt})(\mathcal{G}) = \{(\mathit{saori}, \mathit{st\ jude's})\}$.

• Ill people without medical care:

$(\pi_1(\mathit{hasDisease}) - \pi_1(\mathit{patientOf}))(\mathcal{G}) = \{(\mathit{sue}, \mathit{sue})\}$.

• Healthy doctors:

$(e_1 \cap \bar\pi_1(\mathit{hasDisease}))(\mathcal{G}) = \{(\mathit{saori}, \mathit{saori})\}$.



• Finally, the doctors who know all the patients of some other doctor can
be retrieved by the expression

$e_1 \cap \pi_1((\mathit{knows} / \mathit{patientOf}) \cap 0')$,

which on our example graph yields the empty relation, since the graph
contains only one doctor.

Equivalence Two expressions $e_1$ and $e_2$ are equivalent, denoted by
$e_1 \equiv e_2$, if $e_1(\mathcal{G}) = e_2(\mathcal{G})$ for all possible
structures $\mathcal{G}$. The following equivalences demonstrate that the
derived operations are indeed derived, and also present some additional
interdependencies among the constants and operations considered in this paper:

\begin{align*}
1 &\equiv 0^c \\
0' &\equiv 1'^c \\
e_1 - e_2 &\equiv e_1 \cap e_2^c \\
e^c &\equiv 1 - e \\
\pi_1(e) &\equiv (e \circ e^{-1}) \cap 1' \equiv (e \circ (0' \cup 1')) \cap 1' \equiv \bar\pi_1(\bar\pi_1(e)) \\
\pi_2(e) &\equiv (e^{-1} \circ e) \cap 1' \equiv ((0' \cup 1') \circ e) \cap 1' \equiv \bar\pi_2(\bar\pi_2(e)) \\
\bar\pi_1(e) &\equiv 1' - \pi_1(e) \\
\bar\pi_2(e) &\equiv 1' - \pi_2(e) \\
e_1 / e_2 &\equiv (e_1^c \circ e_2^{-1})^c \equiv (e_2^{-1} \backslash e_1^{-1})^{-1} \\
e_1 \backslash e_2 &\equiv (e_1^{-1} \circ e_2^c)^c \equiv (e_2^{-1} / e_1^{-1})^{-1}
\end{align*}

Language fragments We will consider various fragments of the calculus of
relations. The most basic fragment we consider is denoted by $C$: it has the
constants $0$ and $1'$ and the operators composition, union, and intersection.
All other fragments are defined by adding to $C$ some additional constants and
operators. The fragment $C(\pi)$, for example, consists of the expressions built
up from the relation names, $0$, and $1'$, using the operations composition,
union, intersection, and projection.^2 The full calculus of relations
corresponds to the fragment $C({}^c, {}^{-1})$, since all other operations can
be derived in it. We will characterize indistinguishability for all fragments
that contain $C$.

2 We only consider fragments containing both the first and second projection
($\pi_1$ and $\pi_2$) or none of them. Similarly for coprojection, and also
similarly, we only consider fragments containing both the left and right
residual ($/$ and $\backslash$) or none of them.
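Equivalences between derived and standard operations can be sanity-checked by brute force on small random relations. The sketch below is only such a check, not a proof; the function name `check_equivalences`, the node count, and the edge probability are arbitrary choices made for illustration. Projection, coprojection and the residuals are implemented directly from their set-based semantics and then compared with their expressions in terms of composition, converse and complement.

```python
import random
from itertools import product

def check_equivalences(n=4, trials=200, seed=0):
    """Compare derived operations with their definitions on random relations."""
    rng = random.Random(seed)
    V = range(n)
    full = set(product(V, repeat=2))
    ident = {(s, s) for s in V}

    def conv(r):
        return {(t, s) for (s, t) in r}

    def comp(r, q):
        return {(s, t) for (s, v) in r for (w, t) in q if v == w}

    def compl(r):
        return full - r

    def pi1(r):
        return {(s, s) for (s, _) in r}

    def copi1(r):
        return {(s, s) for s in V if all(a != s for (a, _) in r)}

    def lres(r, q):   # r / q
        return {(s, t) for (s, t) in full
                if all((s, v) in r for (u, v) in q if u == t)}

    def rres(r, q):   # r \ q
        return {(s, t) for (s, t) in full
                if all((v, t) in q for (v, u) in r if u == s)}

    for _ in range(trials):
        e1 = {p for p in full if rng.random() < 0.4}
        e2 = {p for p in full if rng.random() < 0.4}
        assert e1 - e2 == (e1 & compl(e2))
        assert pi1(e1) == (comp(e1, conv(e1)) & ident)
        assert pi1(e1) == (comp(e1, full) & ident)
        assert copi1(e1) == ident - pi1(e1)
        assert pi1(e1) == copi1(copi1(e1))
        assert lres(e1, e2) == compl(comp(compl(e1), conv(e2)))
        assert lres(e1, e2) == conv(rres(conv(e2), conv(e1)))
        assert rres(e1, e2) == compl(comp(conv(e1), compl(e2)))
        assert rres(e1, e2) == conv(lres(conv(e2), conv(e1)))
    return True
```

Calling `check_equivalences()` raises an `AssertionError` if any of the checked identities fails on a sampled pair of relations, and returns `True` otherwise.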

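Indistinguishability A marked structure $\mathbf{G}$ is a triple
$(\mathcal{G}, a, b)$ where $\mathcal{G}$ is a relational structure, and $(a, b)$
is an ordered pair of nodes from $\mathcal{G}$. Let $\mathcal{F}$ be a fragment
of the calculus of relations. The $\mathcal{F}$-type of $\mathbf{G}$, denoted by
$\mathrm{tp}_{\mathcal{F}}(\mathbf{G})$, is defined as the set of all expressions
$e \in \mathcal{F}$ such that $(a,b) \in e(\mathcal{G})$. For two marked
structures $\mathbf{G}_1 = (\mathcal{G}_1, a_1, b_1)$ and
$\mathbf{G}_2 = (\mathcal{G}_2, a_2, b_2)$, we write
$\mathbf{G}_1 \preceq_{\mathcal{F}} \mathbf{G}_2$ if
$\mathrm{tp}_{\mathcal{F}}(\mathcal{G}_1, a_1, b_1) \subseteq \mathrm{tp}_{\mathcal{F}}(\mathcal{G}_2, a_2, b_2)$,
i.e., for every expression $e \in \mathcal{F}$ such that
$(a_1, b_1) \in e(\mathcal{G}_1)$, also $(a_2, b_2) \in e(\mathcal{G}_2)$. We
then say that $\mathbf{G}_2$ is one-sided indistinguishable from $\mathbf{G}_1$
in $\mathcal{F}$. When both $\mathbf{G}_1 \preceq_{\mathcal{F}} \mathbf{G}_2$ and
$\mathbf{G}_2 \preceq_{\mathcal{F}} \mathbf{G}_1$, we say that $\mathbf{G}_1$ and
$\mathbf{G}_2$ are indistinguishable in $\mathcal{F}$ and denote this by
$\mathbf{G}_1 \equiv_{\mathcal{F}} \mathbf{G}_2$.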

Since indistinguishability is the same as one-sided indistinguishability in 
both directions, it is more general to look for a characterization of one-sided 
indistinguishability, and indeed we will provide such characterizations whenever 
we can. On the other hand, we will verify in Proposition 4 that for any
fragment where set difference is present, indistinguishability actually coincides with
one-sided indistinguishability (except in a trivial case) . For these fragments, we 
will thus just talk about indistinguishability for short. 

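In our characterizations of indistinguishability we will also refer to the
atomic $\mathcal{F}$-type of $\mathbf{G}$. The atomic expressions are those from
the finite set $\mathrm{Atom} = \{1', 0', 1, 0\} \cup \{R, R^{-1} \mid R \in A\}$.
The atomic $\mathcal{F}$-type of $\mathbf{G}$, denoted by
$\mathrm{atp}_{\mathcal{F}}(\mathbf{G})$, then equals
$\mathrm{Atom} \cap \mathrm{tp}_{\mathcal{F}}(\mathbf{G})$. The set of atomic
expressions belonging to the fragment $\mathcal{F}$ will be denoted by
$\mathrm{ats}(\mathcal{F})$. Note that $\mathrm{atp}_{\mathcal{F}}(\mathbf{G})$
is always a subset of $\mathrm{ats}(\mathcal{F})$.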

Degrees and paths It is customary to parameterize characterizations of in- 
distinguishability by the degree of expressions. For an expression e, we define 
the degree deg(e) of e as follows. Every atomic expression has degree zero. 
Then, 

\begin{align*}
\deg(e_1 \cup e_2) = \deg(e_1 \cap e_2) = \deg(e_1 - e_2) &= \max(\deg(e_1), \deg(e_2)); \\
\deg(e^c) = \deg(e^{-1}) &= \deg(e); \\
\deg(e_1 \circ e_2) = \deg(e_1 / e_2) = \deg(e_1 \backslash e_2) &= 1 + \max(\deg(e_1), \deg(e_2)); \\
\deg(\pi_1(e)) = \deg(\pi_2(e)) = \deg(\bar\pi_1(e)) = \deg(\bar\pi_2(e)) &= 1 + \deg(e).
\end{align*}

The degree of an expression is the maximum depth of nested applications of 
the composition, projection, co-projection, and the left and right residual oper- 
ation. Intuitively, the degree corresponds to the quantifier rank of the obvious 
translation of e into first-order logic. 

For a fragment $\mathcal{F}$ of the calculus of relations and a natural number
$k$, we denote the set of expressions in $\mathcal{F}$ of degree at most $k$ by
$\mathcal{F}_k$.

Definition 2 ($\mathcal{F}$-$k$-path). Let $\mathcal{F}$ be a fragment of the
calculus of relations, and let $k$ be a natural number.
We define the expressions $\mathrm{paths}^{\mathcal{F}}_k$ by induction on $k$
as follows:

\begin{align*}
\mathrm{paths}^{\mathcal{F}}_0 &:= \bigcup_{e \in \mathrm{ats}(\mathcal{F})} e \\
\mathrm{paths}^{\mathcal{F}}_{k+1} &:= \mathrm{paths}^{\mathcal{F}}_k \cup (\mathrm{paths}^{\mathcal{F}}_k \circ \mathrm{paths}^{\mathcal{F}}_k).
\end{align*}

Since $\mathcal{F}$ is by definition closed under union and composition,
$\mathrm{paths}^{\mathcal{F}}_k$ is in $\mathcal{F}_k$. By definition,
$\mathrm{paths}^{\mathcal{F}}_k$ is equivalent to the union of all compositions
of at most $2^k$ atomic expressions in $\mathcal{F}$. It is instructive to note
that, for the most basic fragment $C$, given a structure $\mathcal{G}$, a pair
$(a, b)$ is in $\mathrm{paths}^{C}_k(\mathcal{G})$ if and only if there is a
directed path of length at most $2^k$ between $a$ and $b$ in $\mathcal{G}$
viewed as a graph. Likewise for the fragment $C({}^{-1})$, but then for
undirected paths. Note also that when $1$, or just $0'$, is in $\mathcal{F}$,
then $1 \equiv 1' \cup 0'$ is always a subexpression of
$\mathrm{paths}^{\mathcal{F}}_k$, so that $\mathrm{paths}^{\mathcal{F}}_k$
becomes equivalent to $1$. Thus, when $1$ or $0'$ is in $\mathcal{F}$, then for
any structure $\mathcal{G}$ and any $k$ we have
$\mathrm{paths}^{\mathcal{F}}_k(\mathcal{G}) = V^2$ where $V$ is the node set of
$\mathcal{G}$.
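For the basic fragment and its extension with converse, the inductive definition amounts to repeated squaring of the "one atomic step" relation. The following sketch illustrates this under the assumption that a structure is given as a node collection plus a dictionary of edge sets; the function names `compose` and `paths` and the flag `with_converse` are introduced here for illustration only.

```python
def compose(r, q):
    """Relational composition of two sets of pairs."""
    return {(s, t) for (s, v) in r for (w, t) in q if v == w}

def paths(k, nodes, rels, with_converse=False):
    """paths_k for the basic fragment (identity plus relation names).

    With with_converse=True the converses of the relation names are also
    atomic, which corresponds to the fragment with converse added.
    """
    step = {(s, s) for s in nodes}            # the constant 1'
    for r in rels.values():
        step |= r
        if with_converse:
            step |= {(t, s) for (s, t) in r}
    for _ in range(k):                        # paths_{i+1} = paths_i U (paths_i o paths_i)
        step = step | compose(step, step)
    return step

# A directed chain 0 -> 1 -> ... -> 5 as a hypothetical test structure.
chain_nodes = range(6)
chain_rels = {"next": {(i, i + 1) for i in range(5)}}
reach_within_4 = paths(2, chain_nodes, chain_rels)
```

On the chain, `reach_within_4` contains exactly the pairs `(i, j)` with `0 <= j - i <= 4`, in line with the bound of $2^k$ steps for $k = 2$.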

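The following lemma shows the relevance of $\mathrm{paths}^{\mathcal{F}}_k$.

Lemma 3. Let $\mathcal{F}$ be a fragment of the calculus of relations and let
$\mathcal{G}$ be a structure. For any expression $e \in \mathcal{F}_0$, we have
$e(\mathcal{G}) \subseteq \mathrm{paths}^{\mathcal{F}}_0(\mathcal{G})$.
Furthermore, unless the residual operations $/$ and $\backslash$ are present in
$\mathcal{F}$, for any natural number $k$ and any expression
$e \in \mathcal{F}_k$, we have
$e(\mathcal{G}) \subseteq \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G})$.

Proof. By structural induction on $e$. If $e$ is atomic, then $e$ has degree
zero and clearly
$e(\mathcal{G}) \subseteq \mathrm{paths}^{\mathcal{F}}_0(\mathcal{G})$ as
$\mathrm{paths}^{\mathcal{F}}_0(\mathcal{G})$ is the union of all atomic
expressions in $\mathcal{F}$.

If $e$ is $e_1 \cup e_2$, $e_1 \cap e_2$, or $e_1 - e_2$, the result follows
immediately from the induction hypothesis.

If $e$ is $\pi_1(e_1)$, $\pi_2(e_1)$, $\bar\pi_1(e_1)$, or $\bar\pi_2(e_1)$, the
result is immediate because
$\pi_1(e_1)(\mathcal{G}) \subseteq 1'(\mathcal{G}) \subseteq \mathrm{paths}^{\mathcal{F}}_0(\mathcal{G})$.
(Similarly for $\pi_2(e_1)$, $\bar\pi_1(e_1)$, and $\bar\pi_2(e_1)$.)

If $e$ is $e_1^c$, the result is trivial. Indeed, if complementation is present
in $\mathcal{F}$, then $0' \equiv 1'^c$ is as well, and since
$1' \cup 0' \equiv 1$, we have
$\mathrm{paths}^{\mathcal{F}}_0(\mathcal{G}) = 1(\mathcal{G})$. Hence,
$\mathrm{paths}^{\mathcal{F}}_k(\mathcal{G}) = 1(\mathcal{G})$ since
$\mathrm{paths}^{\mathcal{F}}_k(\mathcal{G}) \supseteq \mathrm{paths}^{\mathcal{F}}_0(\mathcal{G})$.
The result now trivially follows as $e(\mathcal{G}) \subseteq 1(\mathcal{G})$
for any expression $e$.

If $e$ is $e_1^{-1}$, first observe that, if the converse operation is present
in $\mathcal{F}$, then $(b, a) \in \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G})$
implies $(a, b) \in \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G})$, as is readily
verified by induction on $k$. Now, assume $(a, b) \in e_1^{-1}(\mathcal{G})$.
Then $(b, a) \in e_1(\mathcal{G})$, and, by induction,
$(b, a) \in \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G})$, whence
$(a, b) \in \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G})$.

Finally, if $e$ is $e_1 \circ e_2$, let $k_1 = \deg(e_1)$, $k_2 = \deg(e_2)$,
and $\ell = \max(k_1, k_2)$. Note that $k = \ell + 1$. Now assume
$(a, b) \in (e_1 \circ e_2)(\mathcal{G})$. Then, for some $c \in V$, we have
$(a, c) \in e_1(\mathcal{G})$ and $(c, b) \in e_2(\mathcal{G})$. By induction,
we have $(a, c) \in \mathrm{paths}^{\mathcal{F}}_{k_1}(\mathcal{G})$ and
$(c, b) \in \mathrm{paths}^{\mathcal{F}}_{k_2}(\mathcal{G})$. Since
$\mathrm{paths}^{\mathcal{F}}_i(\mathcal{G}) \subseteq \mathrm{paths}^{\mathcal{F}}_{i+1}(\mathcal{G})$
for any $i$, we have $(a, c) \in \mathrm{paths}^{\mathcal{F}}_\ell(\mathcal{G})$
and $(c, b) \in \mathrm{paths}^{\mathcal{F}}_\ell(\mathcal{G})$. Hence
$(a, b) \in \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G})$ as desired. □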

Note the exception made in the above lemma for the residual operations:
indeed, for example, the degree-one expression $0/0$ is equivalent to $1$, whereas
in general we do not have
$1(\mathcal{G}) \subseteq \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G})$. (Note
that the residuals do not contribute atomic expressions, so
$\mathrm{paths}^{\mathcal{F}}_k$ is the same with or without them.)

To conclude this section, we show, as promised earlier: 

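Proposition 4. Let $\mathcal{F}$ be a fragment of the calculus of relations
where set difference or complementation is present. Let
$\mathbf{G}_1 = (\mathcal{G}_1, a_1, b_1)$ and
$\mathbf{G}_2 = (\mathcal{G}_2, a_2, b_2)$ be two marked structures, and let $k$
be a natural number. Consider the equivalence

$\mathbf{G}_1 \preceq_{\mathcal{F}_k} \mathbf{G}_2 \iff \mathbf{G}_1 \equiv_{\mathcal{F}_k} \mathbf{G}_2.$

This equivalence holds in each of the following cases:

1. either complementation, $0'$, or $1$ is present in $\mathcal{F}$;

2. $k > 0$ and the residuals are present in $\mathcal{F}$;

3. $(a_1, b_1) \in \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G}_1)$.

If none of the above cases is satisfied, then
$\mathbf{G}_1 \preceq_{\mathcal{F}_k} \mathbf{G}_2$ holds trivially and
$\mathbf{G}_1 \equiv_{\mathcal{F}_k} \mathbf{G}_2$ holds if and only if
$(a_2, b_2) \notin \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G}_2)$.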

Proof. Let us first argue the claim for when none of the three cases holds.
Given the absence of cases 2 and 3, we obtain from Lemma 3 that $(a_1, b_1)$
cannot belong to $e(\mathcal{G}_1)$ for any $e \in \mathcal{F}_k$. Hence,
$\mathbf{G}_1 \preceq_{\mathcal{F}_k} \mathbf{G}_2$ is vacuously satisfied.
Moreover, clearly $\mathbf{G}_1 \equiv_{\mathcal{F}_k} \mathbf{G}_2$ iff
$(a_2, b_2)$ does not belong to $e(\mathcal{G}_2)$ for any
$e \in \mathcal{F}_k$ either. We now note that the latter holds iff
$(a_2, b_2) \notin \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G}_2)$. Indeed, the
only-if is clear since $\mathrm{paths}^{\mathcal{F}}_k$ belongs to
$\mathcal{F}_k$; the if-direction is again given by Lemma 3.

In order to prove the equivalence claim, assume
$\mathbf{G}_1 \preceq_{\mathcal{F}_k} \mathbf{G}_2$; we must show that
$\mathbf{G}_2 \preceq_{\mathcal{F}_k} \mathbf{G}_1$. Thereto, let
$e \in \mathcal{F}_k$ such that $(a_2, b_2) \in e(\mathcal{G}_2)$; we must show
that $(a_1, b_1) \in e(\mathcal{G}_1)$. We consider the three different cases
from the statement of the proposition.

1. If either complementation, $0'$, or $1$ is present, then the complement
$e^c$ of an expression $e$ of degree $k$ is expressible by an expression of
degree $k$. Indeed, if complementation is present, this is trivial; if $0'$ is
present then we have $e^c \equiv (1' \cup 0') - e$; and if $1$ is present then
we have $e^c \equiv 1 - e$. (Note that $\mathcal{F}$ has complement or set
difference, so if complement is not present, then set difference is.)

Now assume, for the sake of contradiction, that
$(a_1, b_1) \notin e(\mathcal{G}_1)$. Then $(a_1, b_1) \in e^c(\mathcal{G}_1)$,
whence $(a_2, b_2) \in e^c(\mathcal{G}_2)$ (by
$\mathbf{G}_1 \preceq_{\mathcal{F}_k} \mathbf{G}_2$), whence
$(a_2, b_2) \notin e(\mathcal{G}_2)$, which yields the desired contradiction.

2. We have already noted that $0/0 \equiv 1$. Note that the degree of $0/0$
equals one. Hence, if $k > 0$ and the residuals are present, then $e^c$ can be
expressed as $0/0 - e$, which is still of degree $k$. We can now reason as in
the previous case.

3. Recall that $\mathrm{paths}^{\mathcal{F}}_k$ is an expression of degree $k$.
Now suppose $(a_1, b_1) \in \mathrm{paths}^{\mathcal{F}}_k(\mathcal{G}_1)$, and
assume again for the sake of contradiction that
$(a_1, b_1) \notin e(\mathcal{G}_1)$. Thus
$(a_1, b_1) \in (\mathrm{paths}^{\mathcal{F}}_k - e)(\mathcal{G}_1)$, whence
$(a_2, b_2) \in (\mathrm{paths}^{\mathcal{F}}_k - e)(\mathcal{G}_2)$ by
$\mathbf{G}_1 \preceq_{\mathcal{F}_k} \mathbf{G}_2$. We obtain again the desired
contradiction to the effect that $(a_2, b_2) \notin e(\mathcal{G}_2)$.

□

3 Approaches to bisimilarity 

Before characterizing indistinguishability for fragments of the calculus of
relations, let us first look at characterizations of the full calculus. Tarski
and Givant showed that the calculus has the same expressive power as FO$^3$,
the 3-variable fragment of first-order logic [37]. For FO$^3$, we have the
3-pebble Ehrenfeucht-Fraïssé game as a characterization [5], [T5]. Marx and
Venema, however, showed that the 3-variable fragment of first-order logic also
has the same expressive power as arrow logic [24], a branch of
multi-dimensional modal logic devised to provide a formalization for simple
reasoning about objects that are thought of as arrows. By this correspondence,
bisimulations in terms of back-and-forth conditions that are well known from
modal logic can be used to characterize fragments of FO$^3$, and, hence, of the
calculus of relations.

Concretely, the language of arrow logic is a modal language with the dyadic
operator $\circ$, the monadic operator $\otimes$, and the modal constant id.
Formulas in arrow logic are built up from a set of propositional variables and
the modal constant id, using the operators $\circ$ and $\otimes$, and the
boolean connectives $\wedge$, $\vee$, $\neg$. Using propositional variables to
denote edge labels; by interpreting the modal constant id as being true for
pairs $(a, a)$ of identical nodes; by interpreting the monadic operator
$\otimes$ as being true for pairs $((b, a), (a, b))$ of "arrows" such that the
first arrow is the converse of the second arrow; and finally, by interpreting
the dyadic operator $\circ$ as being true for triples
$((a, b), (a, c), (c, b))$ of arrows such that the first one is obtained by
composing the second and the third arrow, we can apply the characterization
theorem of modal logic to immediately obtain a characterization for the full
calculus of relations. We will next make this more precise.

The notion of bisimulation for multi-dimensional modal logic, specialized to 
the above interpretation of arrow logic, becomes the following: 

Definition 5. Let $\mathcal{G}_1$ and $\mathcal{G}_2$ be two structures with
node sets $V_1$ and $V_2$, respectively. A non-empty relation
$Z \subseteq V_1^2 \times V_2^2$ is a bisimulation between $\mathcal{G}_1$ and
$\mathcal{G}_2$ if it satisfies the following conditions:^3

Atoms if $(a_1, b_1, a_2, b_2)$ is in $Z$, then $(a_1, b_1) \in R(\mathcal{G}_1)$
if and only if $(a_2, b_2) \in R(\mathcal{G}_2)$, for all $R \in A$;

Forth if $(a_1, b_1, a_2, b_2) \in Z$, then

composition ($\circ$) for each $c_1 \in V_1$ there exists $c_2 \in V_2$ such
that both $(a_1, c_1, a_2, c_2)$ and $(c_1, b_1, c_2, b_2)$ are in $Z$;

identity (id) if $a_1 = b_1$ then $a_2 = b_2$;

converse ($\otimes$) $(b_1, a_1, b_2, a_2) \in Z$;

Back if $(a_1, b_1, a_2, b_2)$ is in $Z$, then

composition ($\circ$) for each $c_2 \in V_2$ there exists $c_1 \in V_1$ such
that both $(a_1, c_1, a_2, c_2)$ and $(c_1, b_1, c_2, b_2)$ are in $Z$;

identity (id) if $a_2 = b_2$ then $a_1 = b_1$;

converse ($\otimes$) $(b_1, a_1, b_2, a_2) \in Z$.

A marked structure $\mathbf{G}_1 = (\mathcal{G}_1, a_1, b_1)$ is said to be
bisimilar to a marked structure $\mathbf{G}_2 = (\mathcal{G}_2, a_2, b_2)$ if
there is a bisimulation $Z$ between $\mathcal{G}_1$ and $\mathcal{G}_2$
containing $(a_1, b_1, a_2, b_2)$.

3 The attentive reader will notice that the converse-forth condition and the
converse-back condition are identical. This is a consequence of the symmetry of
the converse operator. We could have simplified the definition by removing one
of the identical conditions, but preferred to stay in line with the general
format of bisimulation conditions for multidimensional modal logic.
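The conditions of Definition 5 can be checked mechanically for a candidate relation on finite structures. The sketch below is one straightforward way to do so, assuming $Z$ is given as a set of 4-tuples and each structure as a node collection plus a dictionary of edge sets; the function name `is_bisimulation` and the two small test structures are hypothetical.

```python
def is_bisimulation(Z, V1, V2, rels1, rels2):
    """Check the Atoms, Forth and Back conditions for a set Z of 4-tuples."""
    if not Z:
        return False                       # a bisimulation must be non-empty
    names = set(rels1) | set(rels2)
    for (a1, b1, a2, b2) in Z:
        # Atoms: the two pairs satisfy exactly the same relation names.
        for name in names:
            in1 = (a1, b1) in rels1.get(name, set())
            in2 = (a2, b2) in rels2.get(name, set())
            if in1 != in2:
                return False
        # Identity, forth and back together.
        if (a1 == b1) != (a2 == b2):
            return False
        # Converse (forth and back coincide).
        if (b1, a1, b2, a2) not in Z:
            return False
        # Composition forth.
        for c1 in V1:
            if not any((a1, c1, a2, c2) in Z and (c1, b1, c2, b2) in Z
                       for c2 in V2):
                return False
        # Composition back.
        for c2 in V2:
            if not any((a1, c1, a2, c2) in Z and (c1, b1, c2, b2) in Z
                       for c1 in V1):
                return False
    return True

# One node with a self-loop, against an isomorphic copy and against a
# two-node structure in which every pair is an R-edge.
loop_x = ({"x"}, {"R": {("x", "x")}})
loop_y = ({"y"}, {"R": {("y", "y")}})
clique = ({"u", "v"}, {"R": {("u", "u"), ("u", "v"), ("v", "u"), ("v", "v")}})

iso_ok = is_bisimulation({("x", "x", "y", "y")},
                         loop_x[0], loop_y[0], loop_x[1], loop_y[1])
clique_ok = is_bisimulation({("x", "x", "u", "u"), ("x", "x", "v", "v")},
                            loop_x[0], clique[0], loop_x[1], clique[1])
```

The second candidate fails the composition-back condition: from `(u, u)` the intermediate node `v` has no counterpart in the one-node structure.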

The following characterization can now be proved in an analogous way to 
known results in modal logic [15] : 

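Proposition 6. Consider the full calculus of relations
$\mathcal{F} = C({}^c, {}^{-1})$. Let $\mathbf{G}_1 = (\mathcal{G}_1, a_1, b_1)$
and $\mathbf{G}_2 = (\mathcal{G}_2, a_2, b_2)$ be finite marked structures. Then

$\mathbf{G}_1 \equiv_{\mathcal{F}} \mathbf{G}_2 \iff \mathbf{G}_1$ is bisimilar to $\mathbf{G}_2$.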

In database theory [3], it is common to replace the complementation operator
by the "safe" difference operator and, to compensate for this weaker operation,
add the diversity operator. So, it is interesting to consider the fragment
$\mathcal{F}_{\mathrm{safe}} = C(-, {}^{-1}, 0')$, with which we deal later.
Furthermore we can consider the "positive fragments", without the difference
operator, which yields the fragment
$\mathcal{F}^{+}_{\mathrm{safe}} = C({}^{-1}, 0')$.

For $\mathcal{F}^{+}_{\mathrm{safe}}$, the above characterization can be easily
adapted. It suffices in the definition of bisimulation to remove the Back
condition (thus obtaining a kind of simulation rather than bisimulation), and
add the following part to the Forth condition:

diversity (di) if $a_1 \neq b_1$, then $a_2 \neq b_2$.

We can then analogously show that
$(\mathcal{G}_1, a_1, b_1) \preceq_{\mathcal{F}^{+}_{\mathrm{safe}}} (\mathcal{G}_2, a_2, b_2)$
if and only if there exists a simulation from $\mathcal{G}_1$ to $\mathcal{G}_2$
containing $(a_1, b_1, a_2, b_2)$.

But if we now add to $\mathcal{F}^{+}_{\mathrm{safe}}$ the coprojection
operators, it is no longer so easy to obtain a characterization of
indistinguishability. Indeed, the coprojection operator, as a non-monotonic
operator, cannot be expressed as a modality in the sense of modal logic.
Another difficulty arises when we remove, e.g., the converse operator or the
diversity relation. As expressions in the calculus always return pairs $(a, b)$
of nodes such that there is a path from $a$ to $b$ in the graph formed by the
atomic steps, it does not suffice to remove the converse-forth or the
diversity-forth parts in the definition of bisimulation; we also need to adapt
the composition-forth part.

In the rest of this section, as a concrete illustration of how to deal with
these difficulties, we will define a notion of similarity that characterizes
indistinguishability in $C(\bar\pi)$, i.e., $\mathcal{F}^{+}_{\mathrm{safe}}$
with coprojection added and diversity and converse removed. This illustration
will serve as a representative example for the later sections where we
characterize indistinguishability for all fragments. For fragments that contain
the difference operation, we will define an appropriate general notion of
bisimilarity in Section 4. For fragments without difference, we will define a
general notion of similarity in Section 5.

3.1 Indistinguishability in $C(\bar\pi)$

We begin by defining the appropriate similarity notion. Note that we define
similarity up to a certain depth $k$. That is because an expression $e$ in
$C(\bar\pi)$ of a fixed degree $k$ can only output pairs of nodes between which
there is a path of length at most $2^k$ (cf. Lemma 3).

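Definition 7 ($C(\bar\pi)$-similarity). Let $k$ be a natural number and let
$\mathbf{G}_1 = (\mathcal{G}_1, a_1, b_1)$ and
$\mathbf{G}_2 = (\mathcal{G}_2, a_2, b_2)$ be marked structures with node sets
$V_1$ and $V_2$, respectively. We say that $\mathbf{G}_1$ is
$C(\bar\pi)$-similar to $\mathbf{G}_2$ up to depth $k$, denoted
$\mathbf{G}_1 \preceq^{C(\bar\pi)}_k \mathbf{G}_2$, if the following conditions
are satisfied:

Atoms if $a_1 = b_1$, then $a_2 = b_2$; furthermore, if
$(a_1, b_1) \in R(\mathcal{G}_1)$, then $(a_2, b_2) \in R(\mathcal{G}_2)$, for
all $R \in A$;

Composition Forth Only required when $k > 0$. For every $c_1 \in V_1$ with
$(a_1, c_1)$ and $(c_1, b_1)$ in
$\mathrm{paths}^{C(\bar\pi)}_{k-1}(\mathcal{G}_1)$,^4 there exists
$c_2 \in V_2$ with $(a_2, c_2)$ and $(c_2, b_2)$ in
$\mathrm{paths}^{C(\bar\pi)}_{k-1}(\mathcal{G}_2)$ such that both
$(\mathcal{G}_1, a_1, c_1) \preceq^{C(\bar\pi)}_{k-1} (\mathcal{G}_2, a_2, c_2)$
and
$(\mathcal{G}_1, c_1, b_1) \preceq^{C(\bar\pi)}_{k-1} (\mathcal{G}_2, c_2, b_2)$;

Coprojection Forth Only required when $k > 0$ and $a_1 = b_1$ (whence
$a_2 = b_2$ by the Atoms condition). For every $c_2 \in V_2$ with $(a_2, c_2)$
in $\mathrm{paths}^{C(\bar\pi)}_{k-1}(\mathcal{G}_2)$, there exists
$c_1 \in V_1$ with $(a_1, c_1)$ in
$\mathrm{paths}^{C(\bar\pi)}_{k-1}(\mathcal{G}_1)$ such that
$(\mathcal{G}_2, a_2, c_2) \preceq^{C(\bar\pi)}_{k-1} (\mathcal{G}_1, a_1, c_1)$.
Furthermore, for every $c_2 \in V_2$ with $(c_2, a_2)$ in
$\mathrm{paths}^{C(\bar\pi)}_{k-1}(\mathcal{G}_2)$, there exists $c_1 \in V_1$
with $(c_1, a_1)$ in $\mathrm{paths}^{C(\bar\pi)}_{k-1}(\mathcal{G}_1)$ such that
$(\mathcal{G}_2, c_2, a_2) \preceq^{C(\bar\pi)}_{k-1} (\mathcal{G}_1, c_1, a_1)$.

Note how, in the Coprojection Forth condition, the direction of the similarity
is reversed.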
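Because the definition is recursive in the depth, similarity up to depth $k$ can be decided on finite structures by a direct memoized recursion. The following is a minimal sketch of such a check and not an optimized algorithm; the factory `make_similarity`, its argument layout, and the convention that `src` selects which structure plays the left-hand role are all assumptions made for this illustration.

```python
from functools import lru_cache

def make_similarity(nodes1, rels1, nodes2, rels2):
    """Return sim(k, src, a1, b1, a2, b2): is (G_src, a1, b1) similar to
    (G_other, a2, b2) up to depth k?  src is 1 or 2."""
    structs = {1: (tuple(nodes1), rels1), 2: (tuple(nodes2), rels2)}
    path_cache = {}

    def paths(side, k):
        # paths_k: identity plus relation names, squared k times.
        if (side, k) not in path_cache:
            V, R = structs[side]
            if k == 0:
                p = {(s, s) for s in V}.union(*R.values())
            else:
                q = paths(side, k - 1)
                p = q | {(s, t) for (s, v) in q for (w, t) in q if v == w}
            path_cache[(side, k)] = p
        return path_cache[(side, k)]

    @lru_cache(maxsize=None)
    def sim(k, src, a1, b1, a2, b2):
        dst = 3 - src
        V1, R1 = structs[src]
        V2, R2 = structs[dst]
        # Atoms (one direction only).
        if a1 == b1 and a2 != b2:
            return False
        for name, r in R1.items():
            if (a1, b1) in r and (a2, b2) not in R2.get(name, set()):
                return False
        if k == 0:
            return True
        P1, P2 = paths(src, k - 1), paths(dst, k - 1)
        # Composition Forth.
        for c1 in V1:
            if (a1, c1) in P1 and (c1, b1) in P1:
                if not any((a2, c2) in P2 and (c2, b2) in P2
                           and sim(k - 1, src, a1, c1, a2, c2)
                           and sim(k - 1, src, c1, b1, c2, b2)
                           for c2 in V2):
                    return False
        # Coprojection Forth: note the reversed direction (dst plays the left role).
        if a1 == b1:
            for c2 in V2:
                if (a2, c2) in P2 and not any(
                        (a1, c1) in P1 and sim(k - 1, dst, a2, c2, a1, c1)
                        for c1 in V1):
                    return False
                if (c2, a2) in P2 and not any(
                        (c1, a1) in P1 and sim(k - 1, dst, c2, a2, c1, a1)
                        for c1 in V1):
                    return False
        return True

    return sim

# Structure 1: a single edge a -> b.  Structure 2: an isolated node x.
sim = make_similarity({"a", "b"}, {"R": {("a", "b")}}, {"x"}, {"R": set()})
```

On this hypothetical pair of structures, `sim(1, 2, "x", "x", "a", "a")` is false because the outgoing edge of `a` has no counterpart at `x`, which matches the fact that the coprojection of `R` holds at `x` but not at `a`.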

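It is instructive to already note the following property, although it will also
follow from our later results:

Proposition 8. If $\mathbf{G}_1 \preceq^{C(\bar\pi)}_{k+1} \mathbf{G}_2$ then
$\mathbf{G}_1 \preceq^{C(\bar\pi)}_{k} \mathbf{G}_2$.

Proof. There are two cases. If
$(a_1, b_1) \in \mathrm{paths}^{C(\bar\pi)}_k(\mathcal{G}_1)$, then in the
Composition Forth condition for
$\mathbf{G}_1 \preceq^{C(\bar\pi)}_{k+1} \mathbf{G}_2$ we can take $c_1 = a_1$
and obtain $c_2 \in V_2$ such that
$(\mathcal{G}_1, a_1, a_1) \preceq^{C(\bar\pi)}_k (\mathcal{G}_2, a_2, c_2)$ and
$(\mathcal{G}_1, a_1, b_1) \preceq^{C(\bar\pi)}_k (\mathcal{G}_2, c_2, b_2)$. By
the Atoms condition for
$(\mathcal{G}_1, a_1, a_1) \preceq^{C(\bar\pi)}_k (\mathcal{G}_2, a_2, c_2)$, we
obtain $c_2 = a_2$ and thus
$(\mathcal{G}_1, a_1, b_1) \preceq^{C(\bar\pi)}_k (\mathcal{G}_2, a_2, b_2)$ as
desired.

If $(a_1, b_1) \notin \mathrm{paths}^{C(\bar\pi)}_k(\mathcal{G}_1)$, then
$\mathbf{G}_1 = (\mathcal{G}_1, a_1, b_1) \preceq^{C(\bar\pi)}_k \mathbf{G}_2$
is vacuously satisfied.

□

4 The sets $\mathrm{paths}^{C(\bar\pi)}_k$ and $\mathrm{paths}^{C}_k$ are equal
for any $k$, and consist of the ordered pairs connected by a directed path of
length at most $2^k$.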

We now show that similarity is sufficient for one-sided indistinguishability
up to degree $k$ (recall that for any fragment $\mathcal{F}$, the set of
expressions in $\mathcal{F}$ of degree at most $k$ is denoted by
$\mathcal{F}_k$):

Lemma 9 (Invariance lemma for $C(\bar\pi)$). If
$\mathbf{G}_1 \preceq^{C(\bar\pi)}_k \mathbf{G}_2$ then
$\mathbf{G}_1 \preceq_{C(\bar\pi)_k} \mathbf{G}_2$.

Proof. Let $\mathbf{G}_1 = (\mathcal{G}_1, a_1, b_1)$ and
$\mathbf{G}_2 = (\mathcal{G}_2, a_2, b_2)$ and let $V_1$ and $V_2$ be the node
sets of the structures $\mathcal{G}_1$ and $\mathcal{G}_2$, respectively. Assume
$\mathbf{G}_1 \preceq^{C(\bar\pi)}_k \mathbf{G}_2$. We prove by structural
induction that, for each expression $e$ in $C(\bar\pi)_k$, if
$(a_1, b_1) \in e(\mathcal{G}_1)$, then $(a_2, b_2) \in e(\mathcal{G}_2)$.

We only show the reasoning for the case where $e$ is a coprojection. So, assume
$(a_1, b_1) \in \bar\pi_1(e_1)(\mathcal{G}_1)$. Then $a_1 = b_1$ and there does
not exist $c_1$ in $V_1$ such that $(a_1, c_1) \in e_1(\mathcal{G}_1)$. By the
Atoms condition, we have that $a_2 = b_2$. To show that
$(a_2, a_2) \in \bar\pi_1(e_1)(\mathcal{G}_2)$, it remains to be shown that
there does not exist $c_2$ in $V_2$ such that
$(a_2, c_2) \in e_1(\mathcal{G}_2)$. Suppose, for the purpose of contradiction,
that such $c_2$ does exist. Note that $e_1$ has degree at most $k - 1$, so, by
Lemma 3, we have
$(a_2, c_2) \in \mathrm{paths}^{C(\bar\pi)}_{k-1}(\mathcal{G}_2)$. Hence, by the
Coprojection Forth condition, there exists $c_1 \in V_1$ such that
$(\mathcal{G}_2, a_2, c_2) \preceq^{C(\bar\pi)}_{k-1} (\mathcal{G}_1, a_1, c_1)$.
By the induction hypothesis, $(a_1, c_1) \in e_1(\mathcal{G}_1)$, but this
contradicts $(a_1, b_1) \in \bar\pi_1(e_1)(\mathcal{G}_1)$.

The case where $e$ is of the form $\bar\pi_2(e_1)$ is analogous. □

In order to show that similarity is also necessary for indistinguishability, the 
following lemma is crucial. 



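Lemma 10 (Representation lemma for $C(\bar\pi)$). Let $k$ be a natural number
and let $\mathbf{G}_1 = (\mathcal{G}_1, a_1, b_1)$ be a marked structure with
$(a_1, b_1) \in \mathrm{paths}^{C(\bar\pi)}_k(\mathcal{G}_1)$. Then there exists
an expression $e^{C(\bar\pi),k}_{\mathbf{G}_1}$ in $C(\bar\pi)_k$ such that, for
every structure $\mathcal{G}_2$:

$e^{C(\bar\pi),k}_{\mathbf{G}_1}(\mathcal{G}_2) = \{(a_2, b_2) \in \mathrm{paths}^{C(\bar\pi)}_k(\mathcal{G}_2) \mid \mathbf{G}_1 \preceq^{C(\bar\pi)}_k (\mathcal{G}_2, a_2, b_2)\}.$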

Proof. The construction of the required expression is by induction on $k$. For
the base of the construction we put^5

$e^{C(\bar\pi),0}_{\mathbf{G}_1} := \mathrm{paths}^{C(\bar\pi)}_0 \cap Y^{C(\bar\pi)}_{\mathbf{G}_1,\text{atoms}}$

where

$Y^{C(\bar\pi)}_{\mathbf{G}_1,\text{atoms}} := \bigcap_{e \in \mathrm{atp}_{C(\bar\pi)}(\mathbf{G}_1)} e.$

In the inductive step of the construction we define

$e^{C(\bar\pi),k+1}_{\mathbf{G}_1} := \mathrm{paths}^{C(\bar\pi)}_{k+1} \cap Y^{C(\bar\pi)}_{\mathbf{G}_1,\text{atoms}} \cap Y^{C(\bar\pi),k+1}_{\mathbf{G}_1,\text{composition forth}} \cap Y^{C(\bar\pi),k+1}_{\mathbf{G}_1,\text{coprojection forth}}$

where (with $V_1$ the set of nodes of $\mathcal{G}_1$)

$Y^{C(\bar\pi)}_{\mathbf{G}_1,\text{atoms}} := \bigcap_{e \in \mathrm{atp}_{C(\bar\pi)}(\mathbf{G}_1)} e,$

$Y^{C(\bar\pi),k+1}_{\mathbf{G}_1,\text{composition forth}} := \bigcap_{\substack{c_1 \in V_1 \\ (a_1,c_1) \in \mathrm{paths}^{C(\bar\pi)}_k(\mathcal{G}_1) \\ (c_1,b_1) \in \mathrm{paths}^{C(\bar\pi)}_k(\mathcal{G}_1)}} e^{C(\bar\pi),k}_{(\mathcal{G}_1,a_1,c_1)} \circ e^{C(\bar\pi),k}_{(\mathcal{G}_1,c_1,b_1)},$

and the Coprojection Forth expression is equal to $1$ if $a_1 \neq b_1$ (the
Coprojection Forth condition is then indeed vacuously satisfied); if
$a_1 = b_1$ it is defined as

$Y^{C(\bar\pi),k+1}_{\mathbf{G}_1,\text{coprojection forth}} := \bar\pi_1\Bigl(\bigcup_{\substack{e \in C(\bar\pi)_k \\ (a_1,b_1) \in \bar\pi_1(e)(\mathcal{G}_1)}} e\Bigr) \cap \bar\pi_2\Bigl(\bigcup_{\substack{e \in C(\bar\pi)_k \\ (a_1,b_1) \in \bar\pi_2(e)(\mathcal{G}_1)}} e\Bigr).$

Although the expressions given for the two Forth conditions are in principle
infinite, they are equivalent to finite expressions. Indeed, note that the
Composition Forth expression is an intersection of expressions of degree at
most $k + 1$, and that the Coprojection Forth expression involves a union over
expressions of degree at most $k$. For any fixed $k$, there are only a finite
number of expressions of degree $k$ up to equivalence. As such, the infinite
intersections and unions can be equivalently expressed as finite intersections
and unions.

5 In this and in many later proofs, empty intersections should be interpreted
as vanishing from the larger expression; empty unions should be interpreted as
the expression $0$.

To show the correctness of the above expression, first note that e^^' k indeed 
has degree at most k. It remains to be shown that, for any structure Q 2l 

e c_(n),k {g2) = {{a2M) e paths cw ( £ 2) 1 g x ^(f) { g^ aM} . 

This can be shown by induction on k. For k = it is clear that (a 2 ,6 2 ) <G 
(Q2) iff the Atoms condition for Q\ < C n^ (£2, 02,62) is satisfied. The 

Q\, atoms v ' ~~ u v ' 

equivalence between (02,62) € <z£^' fe+1 AQ2) and the Composition 

v ' r Qi .composition forth v ' 

Forth condition for Q\ ^+1 (£2,02,62) is also clear, assuming the induction 
hypothesis for k. So, in our reasoning below, we can focus on the expression 
corresponding to the Coprojection Forth condition. 

To prove the $\supseteq$-direction of the asserted equality, let $(a_2,b_2) \in \mathit{paths}^{\mathcal{C}(\overline{\pi})}_{k+1}(\mathcal{G}_2)$ with $\mathbf{G}_1 \preceq^{\mathcal{C}(\overline{\pi})}_{k+1} (\mathcal{G}_2,a_2,b_2)$. We must show that $(a_2,b_2) \in \psi^{\mathcal{C}(\overline{\pi}),k+1}_{\mathbf{G}_1,\text{coprojection forth}}(\mathcal{G}_2)$. If $a_1 \neq b_1$ then this is trivial, so we may assume $a_1 = b_1$, in which case also $a_2 = b_2$ by the Atoms condition. Let $V_2$ denote the set of nodes of $\mathcal{G}_2$. The expression $\psi^{\mathcal{C}(\overline{\pi}),k+1}_{\mathbf{G}_1,\text{coprojection forth}}$ is an intersection of two coprojections; we focus on the first coprojection, as the second one is dealt with in a similar way. Now suppose, for the sake of contradiction, that there exist $c_2 \in V_2$ and $e \in \mathcal{C}(\overline{\pi})_k$ with $(a_1,b_1) \in \overline{\pi}_1(e)(\mathcal{G}_1)$ and $(a_2,c_2) \in e(\mathcal{G}_2)$. Since $\mathbf{G}_1 \preceq^{\mathcal{C}(\overline{\pi})}_{k+1} (\mathcal{G}_2,a_2,b_2)$, we also have $(a_2,b_2) \in \overline{\pi}_1(e)(\mathcal{G}_2)$. But then no $c_2$ can exist so that $(a_2,c_2) \in e(\mathcal{G}_2)$, and we have a contradiction.

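For the $\subseteq$-direction, let $(a_2,b_2)$ be in $e^{\mathcal{C}(\overline{\pi}),k+1}_{\mathbf{G}_1}(\mathcal{G}_2)$. It is clear that $(a_2,b_2) \in \mathit{paths}^{\mathcal{C}(\overline{\pi})}_{k+1}(\mathcal{G}_2)$ and that the Atoms and Composition Forth conditions are satisfied. To argue for the Coprojection Forth condition, assume $a_1 = b_1$, and let $c_2 \in V_2$ be such that $(a_2,c_2) \in \mathit{paths}^{\mathcal{C}(\overline{\pi})}_k(\mathcal{G}_2)$. Suppose, for the sake of contradiction, that there does not exist $c_1 \in V_1$ with $(a_1,c_1) \in \mathit{paths}^{\mathcal{C}(\overline{\pi})}_k(\mathcal{G}_1)$ such that $(\mathcal{G}_2,a_2,c_2) \preceq^{\mathcal{C}(\overline{\pi})}_k (\mathcal{G}_1,a_1,c_1)$, or, by induction, such that $(a_1,c_1) \in e^{\mathcal{C}(\overline{\pi}),k}_{(\mathcal{G}_2,a_2,c_2)}(\mathcal{G}_1)$. Thus $(a_1,a_1) = (a_1,b_1) \in \overline{\pi}_1(e^{\mathcal{C}(\overline{\pi}),k}_{(\mathcal{G}_2,a_2,c_2)})(\mathcal{G}_1)$. Since that expression is of degree at most $k$, and since $(a_2,b_2) \in \psi^{\mathcal{C}(\overline{\pi}),k+1}_{\mathbf{G}_1,\text{coprojection forth}}(\mathcal{G}_2)$, we also have $(a_2,a_2) = (a_2,b_2) \in \overline{\pi}_1(e^{\mathcal{C}(\overline{\pi}),k}_{(\mathcal{G}_2,a_2,c_2)})(\mathcal{G}_2)$. Hence, there does not exist $c$ with the property that $(a_2,c) \in e^{\mathcal{C}(\overline{\pi}),k}_{(\mathcal{G}_2,a_2,c_2)}(\mathcal{G}_2)$, which is a contradiction, since $c = c_2$ does have that property. The reasoning for the $\overline{\pi}_2$ part of the Coprojection Forth condition is entirely analogous. □

We obtain the desired result:

Theorem 11. $\mathbf{G}_1 \trianglerighteq_{\mathcal{C}(\overline{\pi})_k} \mathbf{G}_2$ if and only if $\mathbf{G}_1 \preceq^{\mathcal{C}(\overline{\pi})}_k \mathbf{G}_2$.

Proof. We have already seen the if-direction in Invariance Lemma 9. For the only-if direction, we use the Representation Lemma 10. Let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ and $\mathbf{G}_2 = (\mathcal{G}_2,a_2,b_2)$. Assume for the moment that $(a_1,b_1) \in \mathit{paths}^{\mathcal{C}(\overline{\pi})}_k(\mathcal{G}_1)$. Then, since trivially $\mathbf{G}_1 \preceq^{\mathcal{C}(\overline{\pi})}_k \mathbf{G}_1$, we have $(a_1,b_1) \in e^{\mathcal{C}(\overline{\pi}),k}_{\mathbf{G}_1}(\mathcal{G}_1)$, whence $(a_2,b_2) \in e^{\mathcal{C}(\overline{\pi}),k}_{\mathbf{G}_1}(\mathcal{G}_2)$, whence $\mathbf{G}_1 \preceq^{\mathcal{C}(\overline{\pi})}_k \mathbf{G}_2$ as desired. If $(a_1,b_1) \notin \mathit{paths}^{\mathcal{C}(\overline{\pi})}_k(\mathcal{G}_1)$, then $\mathbf{G}_1 \preceq^{\mathcal{C}(\overline{\pi})}_k \mathbf{G}_2$ is voidly satisfied. □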



4 Bisimilarity for fragments with set difference 
4.1 Fragments without the residuals 

In this section, $\mathcal{F}$ is an arbitrary fixed fragment of the calculus of relations in which set difference or complementation is present, but the residual operations are not.

Definition 12 (Bisimilarity excl. residuals). Let $k$ be a natural number and let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ and $\mathbf{G}_2 = (\mathcal{G}_2,a_2,b_2)$ be marked structures with node sets $V_1$ and $V_2$, respectively. We say that $\mathbf{G}_1$ is $\mathcal{F}$-bisimilar to $\mathbf{G}_2$ up to depth $k$, denoted $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_2$, if the following conditions are satisfied:






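Atoms $\mathit{atp}_{\mathcal{F}}(\mathbf{G}_1) = \mathit{atp}_{\mathcal{F}}(\mathbf{G}_2)$;

Composition Forth if $k > 0$, then, for every $c_1$ in $V_1$ with $(a_1,c_1)$ and $(c_1,b_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$, there exists $c_2$ in $V_2$ with $(a_2,c_2)$ and $(c_2,b_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$ such that both $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ and $(\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,b_2)$;

Projection Forth if $\pi$ is in $(\mathcal{F})$, if $k > 0$, and if $a_1 = b_1$, then, for every $c_1$ in $V_1$ with $(a_1,c_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$ (resp., $(c_1,a_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$), there exists $c_2$ in $V_2$ with $(a_2,c_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$ (resp., $(c_2,a_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$) such that $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ (resp., $(\mathcal{G}_1,c_1,a_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,a_2)$);

Composition Back if $k > 0$, then, for every $c_2$ in $V_2$ with $(a_2,c_2)$ and $(c_2,b_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$, there exists $c_1$ in $V_1$ with $(a_1,c_1)$ and $(c_1,b_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$ such that both $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ and $(\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,b_2)$;

Projection Back if $\pi$ is in $(\mathcal{F})$, if $k > 0$, and if $a_1 = b_1$, then, for every $c_2$ in $V_2$ with $(a_2,c_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$ (resp., $(c_2,a_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$), there exists $c_1$ in $V_1$ with $(a_1,c_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$ (resp., $(c_1,a_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$) such that $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ (resp., $(\mathcal{G}_1,c_1,a_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,a_2)$).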

The reader may wonder why there are no Coprojection Forth and Back conditions in the above definition. The reason is that, since difference is present in the fragment, coprojection is present if and only if projection is, by the equivalences $\overline{\pi}_i(e) = 1' - \pi_i(e)$ and $\pi_i(e) = \overline{\pi}_i(\overline{\pi}_i(e))$. Thus, it suffices to have Projection Forth and Back conditions.
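The two equivalences can be checked mechanically on a small example. The following is only a sketch under stated assumptions: relations are finite sets of pairs, `proj1`/`proj2` return the loops on nodes that have an outgoing/incoming pair (as used in the proof of Lemma 13), and `coproj1`/`coproj2` return the loops on nodes that have none. The function names and the example relation are hypothetical.

```python
def identity(nodes):
    """The identity relation 1' on the given node set."""
    return {(v, v) for v in nodes}

def proj1(e):
    """pi_1(e): loops on nodes with at least one outgoing e-pair."""
    return {(a, a) for (a, _) in e}

def proj2(e):
    """pi_2(e): loops on nodes with at least one incoming e-pair."""
    return {(b, b) for (_, b) in e}

def coproj1(e, nodes):
    """Coprojection: loops on nodes with no outgoing e-pair."""
    sources = {a for (a, _) in e}
    return {(v, v) for v in nodes if v not in sources}

def coproj2(e, nodes):
    """Coprojection: loops on nodes with no incoming e-pair."""
    targets = {b for (_, b) in e}
    return {(v, v) for v in nodes if v not in targets}

# Hypothetical example relation on four nodes.
nodes = {0, 1, 2, 3}
e = {(0, 1), (1, 2), (1, 3)}

# Coprojection from projection and difference.
assert coproj1(e, nodes) == identity(nodes) - proj1(e)
assert coproj2(e, nodes) == identity(nodes) - proj2(e)

# Projection as a double coprojection.
assert proj1(e) == coproj1(coproj1(e, nodes), nodes)
assert proj2(e) == coproj2(coproj2(e, nodes), nodes)
```

A single example does not prove the identities, of course; it only illustrates why one pair of conditions suffices once difference is available.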

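Lemma 13 (Invariance lemma: bisimilarity excl. residuals). If $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_2$ then $\mathbf{G}_1 \equiv_{\mathcal{F}_k} \mathbf{G}_2$.

Proof. Let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ and $\mathbf{G}_2 = (\mathcal{G}_2,a_2,b_2)$ and let $V_1$ and $V_2$ be the node sets of the structures $\mathcal{G}_1$ and $\mathcal{G}_2$, respectively. Let $e$ be an expression in $\mathcal{F}_k$. We prove by induction on the structure of $e$ that $(a_1,b_1) \in e(\mathcal{G}_1)$ if and only if $(a_2,b_2) \in e(\mathcal{G}_2)$.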

For the base case where e is an atomic expression, the result follows imme- 
diately from the atoms condition in the definition of bisimilarity. 

If $e$ is $e_1 \cup e_2$, $e_1 \cap e_2$, $e_1 - e_2$, or $e_1^c$, the result follows immediately from the induction hypothesis.

For the case where $e$ is $e_1 \circ e_2$, consider the only-if direction, i.e., assume that $(a_1,b_1) \in e(\mathcal{G}_1)$. By definition of composition, there exists $c_1$ in $V_1$ with $(a_1,c_1) \in e_1(\mathcal{G}_1)$ and $(c_1,b_1) \in e_2(\mathcal{G}_1)$. Since $e_1$ and $e_2$ have depth at most $k-1$, by Lemma 3 both $(a_1,c_1)$ and $(c_1,b_1)$ are in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$. By the Composition Forth condition, there exists $c_2$ in $V_2$ such that both $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ and $(\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,b_2)$. By induction, we have $(a_2,c_2) \in e_1(\mathcal{G}_2)$ and $(c_2,b_2) \in e_2(\mathcal{G}_2)$, whence $(a_2,b_2) \in e_1 \circ e_2(\mathcal{G}_2)$. The argument for the if-direction is similar, using Composition Back instead of Composition Forth.

Now let $e$ be of the form $e_1^{-1}$. By a straightforward argument using induction on $k$, one can verify for fragments $\mathcal{F}$ in which the converse operator is present, that $(\mathcal{G}_1,a_1,b_1) \simeq^{\mathcal{F}}_k (\mathcal{G}_2,a_2,b_2)$ implies $(\mathcal{G}_1,b_1,a_1) \simeq^{\mathcal{F}}_k (\mathcal{G}_2,b_2,a_2)$. We thus obtain by induction that $(a_1,b_1) \in e(\mathcal{G}_1)$ iff $(b_1,a_1) \in e_1(\mathcal{G}_1)$ iff $(b_2,a_2) \in e_1(\mathcal{G}_2)$ iff $(a_2,b_2) \in e(\mathcal{G}_2)$ as desired.

For the case where $e$ is $\pi_1(e_1)$, consider the only-if direction, i.e., assume $(a_1,b_1) \in \pi_1(e_1)(\mathcal{G}_1)$. By definition of projection, we have $a_1 = b_1$, and there exists $c_1$ in $V_1$ with $(a_1,c_1) \in e_1(\mathcal{G}_1)$. By Lemma 3, we have $(a_1,c_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$. By the Projection Forth condition, there exists $c_2$ in $V_2$ such that $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$. By induction, we have $(a_2,c_2) \in e_1(\mathcal{G}_2)$, whence $(a_2,b_2) \in \pi_1(e_1)(\mathcal{G}_2)$. The argument for the if-direction is similar, using Projection Back instead of Projection Forth. The argument for the case where $e$ is $\pi_2(e_1)$ is analogous. □

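Lemma 14 (Representation lemma: bisimilarity excl. residuals). Let $k$ be a natural number and let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ be a marked structure with $(a_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)$. Then there exists an expression $e^{\mathcal{F},k}_{\mathbf{G}_1}$ in $\mathcal{F}_k$ such that for every structure $\mathcal{G}_2$:

$$e^{\mathcal{F},k}_{\mathbf{G}_1}(\mathcal{G}_2) = \{(a_2,b_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \mid \mathbf{G}_1 \simeq^{\mathcal{F}}_k (\mathcal{G}_2,a_2,b_2)\}.$$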



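Proof. The construction of the required expression is by induction on $k$. For the base of the construction we put

$$e^{\mathcal{F},0}_{\mathbf{G}_1} := (\mathit{paths}^{\mathcal{F}}_0 \cap \varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{posatoms}}) - \varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{negatoms}}$$

where

$$\varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{posatoms}} := \bigcap_{e \in \mathit{atp}_{\mathcal{F}}(\mathbf{G}_1)} e \quad\text{and}\quad \varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{negatoms}} := \bigcup_{e \in \mathit{ats}(\mathcal{F}) - \mathit{atp}_{\mathcal{F}}(\mathbf{G}_1)} e.$$

In the inductive step of the construction we define

$$e^{\mathcal{F},k+1}_{\mathbf{G}_1} := \bigl((\mathit{paths}^{\mathcal{F}}_{k+1} \cap \varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{posatoms}}) - \varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{negatoms}}\bigr) \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition forth}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition back}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection forth}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection back}},$$

where (with $V_1$ the node set of $\mathcal{G}_1$)

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition forth}} := \bigcap_{\substack{c_1 \in V_1 \\ (a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \\ (c_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)}} e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)} \circ e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,b_1)},$$
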

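$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition back}} := \mathit{paths}^{\mathcal{F}}_{k+1} - \bigcup_{V \subseteq V_1} \Bigl(\bigcap_{\substack{c_1 \in V \\ (a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \\ (c_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)}} \bigl(\mathit{paths}^{\mathcal{F}}_k - e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)}\bigr)\Bigr) \circ \Bigl(\bigcap_{\substack{c_1 \in V_1 - V \\ (a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \\ (c_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)}} \bigl(\mathit{paths}^{\mathcal{F}}_k - e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,b_1)}\bigr)\Bigr),$$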



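and the Projection Forth and Projection Back expressions are omitted from $e^{\mathcal{F},k+1}_{\mathbf{G}_1}$ if $a_1 \neq b_1$ or $\pi$ is not present in $(\mathcal{F})$; otherwise they are defined as

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection forth}} := \bigcap_{\substack{c_1 \in V_1 \\ (a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)}} \pi_1\bigl(e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)}\bigr) \cap \bigcap_{\substack{c_1 \in V_1 \\ (c_1,a_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)}} \pi_2\bigl(e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,a_1)}\bigr)$$

and

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection back}} := \Bigl(1' - \pi_1\bigl(\mathit{paths}^{\mathcal{F}}_k - \bigcup_{\substack{c_1 \in V_1 \\ (a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)}} e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)}\bigr)\Bigr) \cap \Bigl(1' - \pi_2\bigl(\mathit{paths}^{\mathcal{F}}_k - \bigcup_{\substack{c_1 \in V_1 \\ (c_1,a_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)}} e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,a_1)}\bigr)\Bigr).$$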

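To show the correctness of the above expression, first note that $e^{\mathcal{F},k}_{\mathbf{G}_1}$ indeed belongs to $\mathcal{F}_k$. It remains to be shown that, for any structure $\mathcal{G}_2$,

$$e^{\mathcal{F},k}_{\mathbf{G}_1}(\mathcal{G}_2) = \{(a_2,b_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \mid \mathbf{G}_1 \simeq^{\mathcal{F}}_k (\mathcal{G}_2,a_2,b_2)\}.$$

This can be shown by induction on $k$. For $k = 0$, it is clear that $(a_2,b_2) \in e^{\mathcal{F},0}_{\mathbf{G}_1}(\mathcal{G}_2)$ iff $\mathit{atp}_{\mathcal{F}}(\mathbf{G}_1) = \mathit{atp}_{\mathcal{F}}(\mathcal{G}_2,a_2,b_2)$, i.e., iff the Atoms condition for $\mathbf{G}_1 \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,b_2)$ is satisfied. Furthermore, assuming the induction hypothesis for $k$, the following equivalences are readily verified for any $(a_2,b_2) \in \mathit{paths}^{\mathcal{F}}_{k+1}(\mathcal{G}_2)$ (the second and third equivalences assume $a_1 = b_1$):

• $(a_2,b_2) \in \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition forth}}(\mathcal{G}_2)$ iff the Composition Forth condition for $\mathbf{G}_1 \simeq^{\mathcal{F}}_{k+1} (\mathcal{G}_2,a_2,b_2)$ is satisfied.

• $(a_2,b_2) \in \bigl((\varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{posatoms}} - \varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{negatoms}}) \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection forth}}\bigr)(\mathcal{G}_2)$ iff the Atoms and Projection Forth conditions for $\mathbf{G}_1 \simeq^{\mathcal{F}}_{k+1} (\mathcal{G}_2,a_2,b_2)$ are satisfied.

• $(a_2,b_2) \in \bigl((\varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{posatoms}} - \varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{negatoms}}) \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection back}}\bigr)(\mathcal{G}_2)$ iff the Atoms and Projection Back conditions for $\mathbf{G}_1 \simeq^{\mathcal{F}}_{k+1} (\mathcal{G}_2,a_2,b_2)$ are satisfied.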

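It remains to prove that $(a_2,b_2) \in \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition back}}(\mathcal{G}_2)$ iff the Composition Back condition for $\mathbf{G}_1 \simeq^{\mathcal{F}}_{k+1} (\mathcal{G}_2,a_2,b_2)$ is satisfied. By inspection of the expression, we see that $(a_2,b_2) \in \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition back}}(\mathcal{G}_2)$ iff there is no subset $V \subseteq V_1$ for which there exists $c_2$ in $V_2$ such that

$$\forall c_1 \in V : \bigl((a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \wedge (c_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)\bigr) \Rightarrow \bigl((a_2,c_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \wedge (a_2,c_2) \notin e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)}(\mathcal{G}_2)\bigr)$$

and

$$\forall c_1 \notin V : \bigl((a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \wedge (c_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)\bigr) \Rightarrow \bigl((c_2,b_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \wedge (c_2,b_2) \notin e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,b_1)}(\mathcal{G}_2)\bigr).$$

In other words, and by induction, for every $V \subseteq V_1$ and for all $c_2$ in $V_2$, we have

$$\exists c_1 \in V : (a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \wedge (c_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \wedge \bigl((a_2,c_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \Rightarrow (\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_k (\mathcal{G}_2,a_2,c_2)\bigr)$$

or

$$\exists c_1 \notin V : (a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \wedge (c_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \wedge \bigl((c_2,b_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \Rightarrow (\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_k (\mathcal{G}_2,c_2,b_2)\bigr).$$

More formally, for every $c_2$ in $V_2$, we have

$$\bigwedge_{V \subseteq V_1} \Bigl(\bigvee_{\substack{c_1 \in V \\ (a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \\ (c_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)}} \bigl((a_2,c_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \Rightarrow (\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_k (\mathcal{G}_2,a_2,c_2)\bigr) \;\vee \bigvee_{\substack{c_1 \in V_1 - V \\ (a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \\ (c_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)}} \bigl((c_2,b_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \Rightarrow (\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_k (\mathcal{G}_2,c_2,b_2)\bigr)\Bigr).$$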

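Now, using commutativity of the logical 'or', and distributivity of the logical 'and' over the logical 'or', we can equivalently write the above as follows, still for every $c_2$ in $V_2$:

$$\bigvee_{\substack{c_1 \in V_1 \\ (a_1,c_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1) \\ (c_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)}} \Bigl(\bigl((a_2,c_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \Rightarrow (\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_k (\mathcal{G}_2,a_2,c_2)\bigr) \wedge \bigl((c_2,b_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \Rightarrow (\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_k (\mathcal{G}_2,c_2,b_2)\bigr)\Bigr),$$

which is exactly the Composition Back condition. □

We conclude:

Theorem 15. $\mathbf{G}_1 \equiv_{\mathcal{F}_k} \mathbf{G}_2$ if and only if $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_2$.

Proof. The if-direction is given by the Invariance Lemma. For the only-if direction, we use the Representation Lemma. Let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ and $\mathbf{G}_2 = (\mathcal{G}_2,a_2,b_2)$. If $(a_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)$, then, since trivially $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_1$, we have $(a_1,b_1) \in e^{\mathcal{F},k}_{\mathbf{G}_1}(\mathcal{G}_1)$, whence $(a_2,b_2) \in e^{\mathcal{F},k}_{\mathbf{G}_1}(\mathcal{G}_2)$, whence $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_2$ as desired. If $(a_1,b_1) \notin \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)$, then the Composition Forth and Back and the Projection Forth and Back conditions are void. The Atoms condition is satisfied because $\mathbf{G}_1 \equiv_{\mathcal{F}_k} \mathbf{G}_2$. We again conclude that $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_2$. □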
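The propositional step used at the end of the proof of the Representation Lemma (rewriting a conjunction over all subsets $V \subseteq V_1$ into a single disjunction over $c_1 \in V_1$) can be sanity-checked by brute force. The sketch below is an illustration only: `A[c]` and `B[c]` are hypothetical stand-ins for the two implications attached to a candidate $c_1$, and the check enumerates all truth assignments over a small index set.

```python
from itertools import combinations, product

def subsets(items):
    """Yield every subset of items as a set."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)

def conjunction_over_subsets(A, B):
    """AND over all V of ( OR_{c in V} A[c]  or  OR_{c not in V} B[c] )."""
    S = set(A)
    return all(
        any(A[c] for c in V) or any(B[c] for c in S - V)
        for V in subsets(S)
    )

def disjunction_over_elements(A, B):
    """OR over all c of ( A[c] and B[c] )."""
    return any(A[c] and B[c] for c in A)

def check_identity(n):
    """Compare both forms on every truth assignment over n indices."""
    for bits in product([False, True], repeat=2 * n):
        A = dict(enumerate(bits[:n]))
        B = dict(enumerate(bits[n:]))
        if conjunction_over_subsets(A, B) != disjunction_over_elements(A, B):
            return False
    return True

assert all(check_identity(n) for n in range(4))
```

The interesting direction is visible in the code: if no `c` satisfies both `A[c]` and `B[c]`, the subset `V = {c : not A[c]}` falsifies the conjunction.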

4.2 Fragments with the residuals 

In this section, $\mathcal{F}$ is an arbitrary fixed fragment of the calculus of relations in which both set difference or complementation, and the residual operations are present.

Definition 16 (Bisimilarity incl. residuals). Let $\mathcal{F}$ be a fragment of the calculus of relations containing the difference and residual operations. Let $k$ be a natural number and let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ and $\mathbf{G}_2 = (\mathcal{G}_2,a_2,b_2)$ be marked structures with node sets $V_1$ and $V_2$, respectively. We say that $\mathbf{G}_1$ is $\mathcal{F}$-bisimilar to $\mathbf{G}_2$ up to depth $k$, denoted $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_2$, if the following conditions are satisfied:

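Atoms $\mathit{atp}_{\mathcal{F}}(\mathbf{G}_1) = \mathit{atp}_{\mathcal{F}}(\mathbf{G}_2)$;

Composition Forth if $k = 1$, then, for every $c_1$ in $V_1$ with $(a_1,c_1)$ and $(c_1,b_1)$ in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_1)$, there exists $c_2$ in $V_2$ such that both $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,c_2)$ and $(\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,c_2,b_2)$;

if $k > 1$, then, for every $c_1$ in $V_1$, there exists $c_2$ in $V_2$ such that both $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ and $(\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,b_2)$;

Projection Forth if $\pi$ is in $(\mathcal{F})$, if $k = 1$, and if $a_1 = b_1$, then, for every $c_1$ in $V_1$ with $(a_1,c_1)$ in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_1)$ (resp., $(c_1,a_1)$ in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_1)$), there exists $c_2$ in $V_2$ such that $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,c_2)$ (resp., $(\mathcal{G}_1,c_1,a_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,c_2,a_2)$);

if $\pi$ is in $(\mathcal{F})$, if $k > 1$, and if $a_1 = b_1$, then, for every $c_1$ in $V_1$, there exists $c_2$ in $V_2$ such that $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ (resp., $(\mathcal{G}_1,c_1,a_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,a_2)$);

Left Residual Forth if $k = 1$, then, for every $c_2$ in $V_2$ with $(b_2,c_2)$ in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$, there exists $c_1$ in $V_1$ such that both $(\mathcal{G}_2,b_2,c_2) \simeq^{\mathcal{F}}_0 (\mathcal{G}_1,b_1,c_1)$ and $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,c_2)$;

if $k > 1$, then, for every $c_2$ in $V_2$, there exists $c_1$ in $V_1$ such that both $(\mathcal{G}_2,b_2,c_2) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_1,b_1,c_1)$ and $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$;

Right Residual Forth if $k = 1$, then, for every $c_2$ in $V_2$ with $(c_2,a_2)$ in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$, there exists $c_1$ in $V_1$ such that both $(\mathcal{G}_2,c_2,a_2) \simeq^{\mathcal{F}}_0 (\mathcal{G}_1,c_1,a_1)$ and $(\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,c_2,b_2)$;

if $k > 1$, then, for every $c_2$ in $V_2$, there exists $c_1$ in $V_1$ such that both $(\mathcal{G}_2,c_2,a_2) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_1,c_1,a_1)$ and $(\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,b_2)$;

Composition Back if $k = 1$, then, for every $c_2$ in $V_2$ with $(a_2,c_2)$ and $(c_2,b_2)$ in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$, there exists $c_1$ in $V_1$ such that both $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,c_2)$ and $(\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,c_2,b_2)$;

if $k > 1$, then, for every $c_2$ in $V_2$, there exists $c_1$ in $V_1$ such that both $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ and $(\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,b_2)$;

Projection Back if $\pi$ is in $(\mathcal{F})$, if $k = 1$, and if $a_1 = b_1$, then, for every $c_2$ in $V_2$ with $(a_2,c_2)$ in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$ (resp., $(c_2,a_2)$ in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$), there exists $c_1$ in $V_1$ such that $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,c_2)$ (resp., $(\mathcal{G}_1,c_1,a_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,c_2,a_2)$);

if $\pi$ is in $(\mathcal{F})$, if $k > 1$, and if $a_1 = b_1$, then, for every $c_2$ in $V_2$, there exists $c_1$ in $V_1$ such that $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ (resp., $(\mathcal{G}_1,c_1,a_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,a_2)$);

Left Residual Back if $k = 1$, then, for every $c_1$ in $V_1$, there exists $c_2$ in $V_2$ such that $(\mathcal{G}_2,b_2,c_2) \simeq^{\mathcal{F}}_0 (\mathcal{G}_1,b_1,c_1)$, and either $(a_2,c_2) \notin \mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$, or $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,c_2)$;

if $k > 1$, then, for every $c_1$ in $V_1$, there exists $c_2$ in $V_2$ such that both $(\mathcal{G}_1,b_1,c_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,b_2,c_2)$ and $(\mathcal{G}_2,a_2,c_2) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_1,a_1,c_1)$;

Right Residual Back if $k = 1$, then, for every $c_1$ in $V_1$, there exists $c_2$ in $V_2$ such that $(\mathcal{G}_1,c_1,a_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,c_2,a_2)$, and either $(c_2,b_2) \notin \mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$, or $(\mathcal{G}_2,c_2,b_2) \simeq^{\mathcal{F}}_0 (\mathcal{G}_1,c_1,b_1)$;

if $k > 1$, then, for every $c_1$ in $V_1$, there exists $c_2$ in $V_2$ such that both $(\mathcal{G}_1,c_1,a_1) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,a_2)$ and $(\mathcal{G}_2,c_2,b_2) \simeq^{\mathcal{F}}_{k-1} (\mathcal{G}_1,c_1,b_1)$.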

Note that the conditions associated to the residuals have a special case for $k = 1$. This is because the operands of a residual in an expression of degree $k = 1$, being expressions of degree 0, are necessarily contained in $\mathit{paths}^{\mathcal{F}}_0$, whereas this need no longer be the case in higher-degree expressions; recall Lemma 3.

Lemma 17 (Invariance lemma: bisimilarity incl. residuals). If $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_2$, then $\mathbf{G}_1 \equiv_{\mathcal{F}_k} \mathbf{G}_2$.

Proof. The proof proceeds like the proof of Lemma 13; we only discuss what is new.

For the case where $e$ is $e_1 \circ e_2$, consider the only-if direction, i.e., assume $(a_1,b_1) \in e(\mathcal{G}_1)$. By definition of composition there exists $c_1$ in $V_1$ with $(a_1,c_1) \in e_1(\mathcal{G}_1)$ and $(c_1,b_1) \in e_2(\mathcal{G}_1)$. If $k = 1$, we have that $e_1$ and $e_2$ are in $\mathcal{F}_0$. By Lemma 3 we have that both $(a_1,c_1)$ and $(c_1,b_1)$ are in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_1)$. By the Composition Forth condition in the definition of bisimilarity, there exists $c_2$ in $V_2$ such that both $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,c_2)$ and $(\mathcal{G}_1,c_1,b_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,c_2,b_2)$. By induction, we have $(a_2,c_2) \in e_1(\mathcal{G}_2)$ and $(c_2,b_2) \in e_2(\mathcal{G}_2)$, and hence $(a_2,b_2) \in e_1 \circ e_2(\mathcal{G}_2)$. If $k > 1$, the argument is again the same as in the proof of Lemma 13.

For the case where $e$ is $e_1 / e_2$, consider the only-if direction, i.e., assume $(a_1,b_1) \in e(\mathcal{G}_1)$. Suppose now that $(a_2,b_2) \notin e_1 / e_2(\mathcal{G}_2)$. Then, by definition of the left residual, there exists $c_2$ in $V_2$ such that (1) $(b_2,c_2) \in e_2(\mathcal{G}_2)$, and (2) $(a_2,c_2) \notin e_1(\mathcal{G}_2)$. If $k = 1$, we have that $e_2$ is in $\mathcal{F}_0$. By Lemma 3 we have that $(b_2,c_2)$ is in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$. By the Left Residual Forth condition in the definition of bisimilarity, there exists $c_1$ in $V_1$ such that (1) $(\mathcal{G}_2,b_2,c_2) \simeq^{\mathcal{F}}_0 (\mathcal{G}_1,b_1,c_1)$, and (2) $(\mathcal{G}_1,a_1,c_1) \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,c_2)$. By the induction hypothesis, we obtain $(b_1,c_1) \in e_2(\mathcal{G}_1)$ and $(a_1,c_1) \notin e_1(\mathcal{G}_1)$. Now, this $c_1$ contradicts that $(a_1,b_1) \in e_1 / e_2(\mathcal{G}_1)$. If $k > 1$, the argument is similar. Also, the argument for the if-direction is similar to the argument for the only-if direction; it uses the Left Residual Back condition in the definition of bisimilarity.

The case of a right residual is analogous to that of a left residual. □

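Lemma 18 (Representation lemma: bisimilarity incl. residuals). Let $\mathcal{F}$ be a fragment of the calculus of relations containing the difference and the residual operations. Let $k$ be a natural number and let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ be a marked structure. There exists an expression $e^{\mathcal{F},k}_{\mathbf{G}_1}$ in $\mathcal{F}_k$ such that for every structure $\mathcal{G}_2$ with node set $V_2$:

$$e^{\mathcal{F},0}_{\mathbf{G}_1}(\mathcal{G}_2) = \{(a_2,b_2) \in \mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2) \mid \mathbf{G}_1 \simeq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,b_2)\}, \text{ and}$$

$$\text{for } k \geq 1:\quad e^{\mathcal{F},k}_{\mathbf{G}_1}(\mathcal{G}_2) = \{(a_2,b_2) \in V_2 \times V_2 \mid \mathbf{G}_1 \simeq^{\mathcal{F}}_k (\mathcal{G}_2,a_2,b_2)\}.$$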

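Proof. The proof proceeds like that of Lemma 14; we only discuss what is new. The expression for $k = 0$ is the same as before. In the inductive step, the expression has the following general form:

$$e^{\mathcal{F},k+1}_{\mathbf{G}_1} := (\varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{posatoms}} - \varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{negatoms}}) \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition forth}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition back}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection forth}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection back}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{leftres forth}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{leftres back}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{rightres forth}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{rightres back}}.$$

The subexpressions have different definitions depending on whether the degree is 1 or higher. For degree 1, the subexpressions for composition and projection (back and forth) are as in Lemma 14. The other subexpressions for degree 1 are as follows:

$$\psi^{\mathcal{F},1}_{\mathbf{G}_1,\text{leftres forth}} := \bigcap_{V \subseteq V_1} \Bigl(\bigcup_{c_1 \in V_1 - V} e^{\mathcal{F},0}_{(\mathcal{G}_1,a_1,c_1)}\Bigr) \Big/ \Bigl(\mathit{paths}^{\mathcal{F}}_0 - \bigcup_{c_1 \in V} e^{\mathcal{F},0}_{(\mathcal{G}_1,b_1,c_1)}\Bigr);$$

$$\psi^{\mathcal{F},1}_{\mathbf{G}_1,\text{leftres back}} := \Bigl(\bigcup_{c_1 \in V_1} \bigl(\mathit{paths}^{\mathcal{F}}_0 - e^{\mathcal{F},0}_{(\mathcal{G}_1,a_1,c_1)}\bigr) \big/ e^{\mathcal{F},0}_{(\mathcal{G}_1,b_1,c_1)}\Bigr)^c;$$

$$\psi^{\mathcal{F},1}_{\mathbf{G}_1,\text{rightres forth}} := \bigcap_{V \subseteq V_1} \Bigl(\mathit{paths}^{\mathcal{F}}_0 - \bigcup_{c_1 \in V} e^{\mathcal{F},0}_{(\mathcal{G}_1,c_1,a_1)}\Bigr) \Big\backslash \Bigl(\bigcup_{c_1 \in V_1 - V} e^{\mathcal{F},0}_{(\mathcal{G}_1,c_1,b_1)}\Bigr);$$

$$\psi^{\mathcal{F},1}_{\mathbf{G}_1,\text{rightres back}} := \Bigl(\bigcup_{c_1 \in V_1} e^{\mathcal{F},0}_{(\mathcal{G}_1,c_1,a_1)} \big\backslash \bigl(\mathit{paths}^{\mathcal{F}}_0 - e^{\mathcal{F},0}_{(\mathcal{G}_1,c_1,b_1)}\bigr)\Bigr)^c.$$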



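For $k > 0$, we define

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition forth}} := \bigcap_{c_1 \in V_1} e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)} \circ e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,b_1)};$$

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition back}} := \Bigl(\bigcup_{V \subseteq V_1} \Bigl(\bigcap_{c_1 \in V} \bigl(e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)}\bigr)^c\Bigr) \circ \Bigl(\bigcap_{c_1 \in V_1 - V} \bigl(e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,b_1)}\bigr)^c\Bigr)\Bigr)^c;$$

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection forth}} := \bigcap_{c_1 \in V_1} \pi_1\bigl(e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)}\bigr) \cap \bigcap_{c_1 \in V_1} \pi_2\bigl(e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,a_1)}\bigr);$$

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection back}} := \pi_1\Bigl(\bigl(\bigcup_{c_1 \in V_1} e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)}\bigr)^c\Bigr)^c \cap \pi_2\Bigl(\bigl(\bigcup_{c_1 \in V_1} e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,a_1)}\bigr)^c\Bigr)^c;$$

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{leftres forth}} := \bigcap_{V \subseteq V_1} \Bigl(\bigcup_{c_1 \in V_1 - V} e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)}\Bigr) \Big/ \Bigl(\bigcup_{c_1 \in V} e^{\mathcal{F},k}_{(\mathcal{G}_1,b_1,c_1)}\Bigr)^c;$$

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{leftres back}} := \Bigl(\bigcup_{c_1 \in V_1} \bigl(e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)}\bigr)^c \big/ e^{\mathcal{F},k}_{(\mathcal{G}_1,b_1,c_1)}\Bigr)^c;$$

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{rightres forth}} := \bigcap_{V \subseteq V_1} \Bigl(\bigcup_{c_1 \in V} e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,a_1)}\Bigr)^c \Big\backslash \Bigl(\bigcup_{c_1 \in V_1 - V} e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,b_1)}\Bigr);$$

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{rightres back}} := \Bigl(\bigcup_{c_1 \in V_1} e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,a_1)} \big\backslash \bigl(e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,b_1)}\bigr)^c\Bigr)^c.$$
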

Note that the above expressions use complementation, which is allowed since 
complementation is definable as e c = (0/0) — e, an expression which has the 
same degree as e provided the degree of e is at least one. (Recall that 0/0 = 1.) 
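As an illustration of the remark above, the left residual and the derived complement can be evaluated directly on finite relations. This is a sketch under an assumption read off from the proof of Lemma 17: $(a,b) \in e_1 / e_2$ holds exactly when every $c$ with $(b,c) \in e_2$ also satisfies $(a,c) \in e_1$. The function names and the example data are hypothetical.

```python
def left_residual(e1, e2, nodes):
    """e1 / e2: pairs (a, b) such that (b, c) in e2 implies (a, c) in e1."""
    return {
        (a, b)
        for a in nodes
        for b in nodes
        if all((a, c) in e1 for c in nodes if (b, c) in e2)
    }

def complement(e, nodes):
    """Complement of e, defined as (0/0) - e with 0 the empty relation."""
    return left_residual(set(), set(), nodes) - e

nodes = {0, 1, 2}
full = {(a, b) for a in nodes for b in nodes}

# 0/0 is the full relation 1: the implication is vacuously true everywhere.
assert left_residual(set(), set(), nodes) == full

# The derived complement agrees with the set-theoretic one.
e = {(0, 1), (2, 2)}
assert complement(e, nodes) == full - e
```

Note that the derived complement nests `e` under one difference only, which matches the observation that its degree equals that of `e` once that degree is at least one.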
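To show correctness, we show that $(a_2,b_2) \in \psi^{\mathcal{F},1}_{\mathbf{G}_1,\text{leftres forth}}(\mathcal{G}_2)$ iff the Left Residual Forth condition for $\mathbf{G}_1 \simeq^{\mathcal{F}}_1 (\mathcal{G}_2,a_2,b_2)$ is satisfied. Inspecting the expression, we see that $(a_2,b_2) \in \psi^{\mathcal{F},1}_{\mathbf{G}_1,\text{leftres forth}}(\mathcal{G}_2)$ iff for all subsets $V \subseteq V_1$ and for all $c_2 \in V_2$, if $(b_2,c_2)$ is in $\mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2) - \bigcup_{c_1 \in V} e^{\mathcal{F},0}_{(\mathcal{G}_1,b_1,c_1)}(\mathcal{G}_2)$, then $(a_2,c_2)$ is in $\bigcup_{c_1 \in V_1 - V} e^{\mathcal{F},0}_{(\mathcal{G}_1,a_1,c_1)}(\mathcal{G}_2)$. In other words, for all subsets $V \subseteq V_1$ and for all $c_2 \in V_2$ with $(b_2,c_2) \in \mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$:

$$(b_2,c_2) \in \bigcup_{c_1 \in V} e^{\mathcal{F},0}_{(\mathcal{G}_1,b_1,c_1)}(\mathcal{G}_2) \;\vee\; (a_2,c_2) \in \bigcup_{c_1 \in V_1 - V} e^{\mathcal{F},0}_{(\mathcal{G}_1,a_1,c_1)}(\mathcal{G}_2).$$

Equivalently, for all $c_2 \in V_2$ with $(b_2,c_2) \in \mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$:

$$\bigwedge_{V \subseteq V_1} \Bigl(\bigvee_{c_1 \in V} (b_2,c_2) \in e^{\mathcal{F},0}_{(\mathcal{G}_1,b_1,c_1)}(\mathcal{G}_2) \;\vee \bigvee_{c_1 \in V_1 - V} (a_2,c_2) \in e^{\mathcal{F},0}_{(\mathcal{G}_1,a_1,c_1)}(\mathcal{G}_2)\Bigr).$$

Now, using commutativity of the logical 'or', and distributivity of the logical 'and' over the logical 'or', we can equivalently write this as follows: for all $c_2 \in V_2$ with $(b_2,c_2) \in \mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2)$

$$\bigvee_{c_1 \in V_1} \Bigl((b_2,c_2) \in e^{\mathcal{F},0}_{(\mathcal{G}_1,b_1,c_1)}(\mathcal{G}_2) \wedge (a_2,c_2) \in e^{\mathcal{F},0}_{(\mathcal{G}_1,a_1,c_1)}(\mathcal{G}_2)\Bigr),$$

which is, by the result for $k = 0$, exactly the Left Residual Forth condition.

We next show that $(a_2,b_2) \in \psi^{\mathcal{F},1}_{\mathbf{G}_1,\text{leftres back}}(\mathcal{G}_2)$ iff the Left Residual Back condition for $\mathbf{G}_1 \simeq^{\mathcal{F}}_1 (\mathcal{G}_2,a_2,b_2)$ is satisfied. Inspecting the expression, we see that $(a_2,b_2) \in \psi^{\mathcal{F},1}_{\mathbf{G}_1,\text{leftres back}}(\mathcal{G}_2)$ iff for every $c_1 \in V_1$ we have

$$(a_2,b_2) \notin \bigl(\mathit{paths}^{\mathcal{F}}_0 - e^{\mathcal{F},0}_{(\mathcal{G}_1,a_1,c_1)}\bigr) \big/ e^{\mathcal{F},0}_{(\mathcal{G}_1,b_1,c_1)}(\mathcal{G}_2).$$

By definition of the left residual, the above means that there exists $c_2 \in V_2$ such that

$$(b_2,c_2) \in e^{\mathcal{F},0}_{(\mathcal{G}_1,b_1,c_1)}(\mathcal{G}_2) \wedge (a_2,c_2) \notin \bigl(\mathit{paths}^{\mathcal{F}}_0 - e^{\mathcal{F},0}_{(\mathcal{G}_1,a_1,c_1)}\bigr)(\mathcal{G}_2).$$

Equivalently, for each $c_1 \in V_1$ there exists $c_2 \in V_2$ such that

$$(b_2,c_2) \in e^{\mathcal{F},0}_{(\mathcal{G}_1,b_1,c_1)}(\mathcal{G}_2) \wedge \bigl((a_2,c_2) \notin \mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2) \vee (a_2,c_2) \in e^{\mathcal{F},0}_{(\mathcal{G}_1,a_1,c_1)}(\mathcal{G}_2)\bigr),$$

which is, by the result for $k = 0$, exactly the Left Residual Back condition.

Similar arguments are used to show the corresponding equivalences regarding $\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{leftres forth}}$ and $\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{leftres back}}$, and also the arguments for the right residual are similar. □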
The results in Section 4.1 and Section 4.2 lead us to

Theorem 19. Let $\mathcal{F}$ be a fragment of the calculus of relations containing the difference operation. Let $k$ be a natural number and let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ and $\mathbf{G}_2 = (\mathcal{G}_2,a_2,b_2)$ be marked structures. Then, $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_2$ if and only if $\mathbf{G}_1 \equiv_{\mathcal{F}_k} \mathbf{G}_2$.


Proof. The only-if direction has already been given by Invariance Lemma 13 (for fragments not containing residual operations) and Invariance Lemma 17 (for fragments that do contain the residuals).

The if-direction follows from Representation Lemma 14 (for fragments not containing residual operations) and from Representation Lemma 18 (for fragments that do contain a residual operation). In particular, let $\mathcal{F}$ be a fragment not containing residual operations and assume $\mathbf{G}_1 \equiv_{\mathcal{F}_k} \mathbf{G}_2$. We consider two cases: (1) $(a_1,b_1) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)$, and (2) $(a_1,b_1) \notin \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_1)$. For case (1), by Representation Lemma 14 we have $(a_1,b_1) \in e^{\mathcal{F},k}_{\mathbf{G}_1}(\mathcal{G}_1)$ and therefore also $(a_2,b_2) \in e^{\mathcal{F},k}_{\mathbf{G}_1}(\mathcal{G}_2)$. By definition of $e^{\mathcal{F},k}_{\mathbf{G}_1}$, we obtain $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_2$. For case (2), the bisimulation conditions are vacuously true.

For the case where $\mathcal{F}$ is a fragment that does contain the residuals, assume again $\mathbf{G}_1 \equiv_{\mathcal{F}_k} \mathbf{G}_2$. We again consider two cases. If $k \geq 1$, we use Representation Lemma 18 to obtain $\mathbf{G}_1 \simeq^{\mathcal{F}}_k \mathbf{G}_2$. If $k = 0$, it is clear that $\mathbf{G}_1 \equiv_{\mathcal{F}_0} \mathbf{G}_2$ implies the Atoms condition of $\mathbf{G}_1 \simeq^{\mathcal{F}}_0 \mathbf{G}_2$. □



5 Similarity for fragments without set difference 

5.1 Fragments not containing residual operations 

In Section 3.1 we defined similarity for the fragment $\mathcal{C}(\overline{\pi})$ (Definition 7), and showed the invariance and representation lemmas (Lemma 9 and Lemma 10). For completeness, we list the more general definition of similarity for fragments not containing residual operations and give, without proof, the corresponding invariance lemma and representation lemma.

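Definition 20 (Similarity excl. residuals). Let $\mathcal{F}$ be a fragment of the calculus of relations not containing the difference operation and not containing residual operations. Let $k$ be a natural number and let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ and $\mathbf{G}_2 = (\mathcal{G}_2,a_2,b_2)$ be marked structures with node sets $V_1$ and $V_2$, respectively. We say that $\mathbf{G}_1$ is $\mathcal{F}$-similar to $\mathbf{G}_2$ up to depth $k$, denoted $\mathbf{G}_1 \preceq^{\mathcal{F}}_k \mathbf{G}_2$, if the following conditions are satisfied:

Atoms $\mathit{atp}_{\mathcal{F}}(\mathbf{G}_1) \subseteq \mathit{atp}_{\mathcal{F}}(\mathbf{G}_2)$;

Composition Forth if $k > 0$, then, for every $c_1$ in $V_1$ with $(a_1,c_1)$ and $(c_1,b_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$, there exists $c_2$ in $V_2$ with $(a_2,c_2)$ and $(c_2,b_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$ such that both $(\mathcal{G}_1,a_1,c_1) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ and $(\mathcal{G}_1,c_1,b_1) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,b_2)$;

Projection Forth if $\pi$ is in $(\mathcal{F})$, if $k > 0$, and if $a_1 = b_1$, then, for every $c_1$ in $V_1$ with $(a_1,c_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$ (resp., $(c_1,a_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$), there exists $c_2$ in $V_2$ with $(a_2,c_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$ (resp., $(c_2,a_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$) such that $(\mathcal{G}_1,a_1,c_1) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ (resp., $(\mathcal{G}_1,c_1,a_1) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,a_2)$);

Coprojection Forth if $\overline{\pi}$ is in $(\mathcal{F})$, if $k > 0$, and if $a_1 = b_1$, then, for every $c_2$ in $V_2$ with $(a_2,c_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$ (resp., $(c_2,a_2)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_2)$), there exists $c_1$ in $V_1$ with $(a_1,c_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$ (resp., $(c_1,a_1)$ in $\mathit{paths}^{\mathcal{F}}_{k-1}(\mathcal{G}_1)$) such that $(\mathcal{G}_2,a_2,c_2) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_1,a_1,c_1)$ (resp., $(\mathcal{G}_2,c_2,a_2) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_1,c_1,a_1)$).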

Lemma 21 (Invariance lemma: similarity excl. residuals). Let $\mathcal{F}$ be a fragment of the calculus of relations not containing the difference operation and not containing residual operations. Let $k$ be a natural number and let $\mathbf{G}_1$ and $\mathbf{G}_2$ be marked structures. If $\mathbf{G}_1 \preceq^{\mathcal{F}}_k \mathbf{G}_2$, then $\mathbf{G}_1 \trianglerighteq_{\mathcal{F}_k} \mathbf{G}_2$.

Lemma 22 (Representation lemma: similarity excl. residuals). Let $\mathcal{F}$ be a fragment of the calculus of relations not containing the difference operation and not containing residual operations. Let $k$ be a natural number and let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ be a marked structure. There exists an expression $e^{\mathcal{F},k}_{\mathbf{G}_1}$ in $\mathcal{F}_k$ such that for every structure $\mathcal{G}_2$:

$$e^{\mathcal{F},k}_{\mathbf{G}_1}(\mathcal{G}_2) = \{(a_2,b_2) \in \mathit{paths}^{\mathcal{F}}_k(\mathcal{G}_2) \mid \mathbf{G}_1 \preceq^{\mathcal{F}}_k (\mathcal{G}_2,a_2,b_2)\}.$$

5.2 Fragments with the residuals 

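Definition 23 (Similarity incl. residuals). Let $\mathcal{F}$ be a fragment of the calculus of relations containing a residual operation and not containing the difference operation. Let $k$ be a natural number and let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ and $\mathbf{G}_2 = (\mathcal{G}_2,a_2,b_2)$ be marked structures with node sets $V_1$ and $V_2$, respectively. We say that $\mathbf{G}_1$ is $\mathcal{F}$-similar to $\mathbf{G}_2$ up to depth $k$, denoted $\mathbf{G}_1 \preceq^{\mathcal{F}}_k \mathbf{G}_2$, if the following conditions are satisfied:

Atoms $\mathit{atp}_{\mathcal{F}}(\mathbf{G}_1) \subseteq \mathit{atp}_{\mathcal{F}}(\mathbf{G}_2)$;

Composition Forth if $k > 0$, then, for every $c_1$ in $V_1$, there exists $c_2$ in $V_2$ such that both $(\mathcal{G}_1,a_1,c_1) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ and $(\mathcal{G}_1,c_1,b_1) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,b_2)$;

Projection Forth if $\pi$ is in $(\mathcal{F})$, if $k > 0$, and if $a_1 = b_1$, then, for every $c_1$ in $V_1$, there exists $c_2$ in $V_2$ such that $(\mathcal{G}_1,a_1,c_1) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$ (resp., $(\mathcal{G}_1,c_1,a_1) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,a_2)$);

Left Residual Forth if $k > 0$, then, for every $c_2$ in $V_2$, there exists $c_1$ in $V_1$ such that both $(\mathcal{G}_2,b_2,c_2) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_1,b_1,c_1)$ and $(\mathcal{G}_1,a_1,c_1) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,a_2,c_2)$;

Right Residual Forth if $k > 0$, then, for every $c_2$ in $V_2$, there exists $c_1$ in $V_1$ such that both $(\mathcal{G}_2,c_2,a_2) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_1,c_1,a_1)$ and $(\mathcal{G}_1,c_1,b_1) \preceq^{\mathcal{F}}_{k-1} (\mathcal{G}_2,c_2,b_2)$.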

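Unlike the situation for bisimilarity, here no special case for $k = 1$ is needed, due to the general principle that whenever $(a_1,c_1) \notin \mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_1)$ (or $(c_1,b_1) \notin \mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_1)$) then $(\mathcal{G}_1,a_1,c_1) \preceq^{\mathcal{F}}_0 \mathbf{G}_2$ (or $(\mathcal{G}_1,c_1,b_1) \preceq^{\mathcal{F}}_0 \mathbf{G}_2$) holds trivially for any $\mathbf{G}_2$.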

The following lemma is proven in the same way as the preceding invariance lemmas.

Lemma 24 (Invariance lemma: similarity incl. residuals). Let $\mathcal{F}$ be a fragment of the calculus of relations containing a residual operation and not containing the difference operation. Let $k$ be a natural number and let $\mathbf{G}_1$ and $\mathbf{G}_2$ be marked structures. If $\mathbf{G}_1 \preceq^{\mathcal{F}}_k \mathbf{G}_2$, then $\mathbf{G}_1 \trianglerighteq_{\mathcal{F}_k} \mathbf{G}_2$.

Proof. The proof proceeds as the proof of Lemma 17. □

We finally show

Lemma 25 (Representation lemma: similarity incl. residuals). Let $\mathcal{F}$ be a fragment of the calculus of relations containing a residual operation and not containing the difference operation. Let $k$ be a natural number and let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ be a marked structure. There exists an expression $e^{\mathcal{F},k}_{\mathbf{G}_1}$ in $\mathcal{F}_k$ such that for every structure $\mathcal{G}_2$ with node set $V_2$:

$$e^{\mathcal{F},0}_{\mathbf{G}_1}(\mathcal{G}_2) = \{(a_2,b_2) \in \mathit{paths}^{\mathcal{F}}_0(\mathcal{G}_2) \mid \mathbf{G}_1 \preceq^{\mathcal{F}}_0 (\mathcal{G}_2,a_2,b_2)\}, \text{ and}$$

$$\text{for } k \geq 1:\quad e^{\mathcal{F},k}_{\mathbf{G}_1}(\mathcal{G}_2) = \{(a_2,b_2) \in V_2 \times V_2 \mid \mathbf{G}_1 \preceq^{\mathcal{F}}_k (\mathcal{G}_2,a_2,b_2)\}.$$

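Proof. The proof proceeds as for the preceding Representation Lemmas 10 and 18. The expression for $k = 0$ is the same as in Lemma 10. In the inductive step the expression has the following general form:

$$e^{\mathcal{F},k+1}_{\mathbf{G}_1} := \varphi^{\mathcal{F}}_{\mathbf{G}_1,\text{atoms}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{composition forth}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{projection forth}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{leftres forth}} \cap \psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{rightres forth}}$$

with

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{leftres forth}} := \bigcap_{V \subseteq V_1} \Bigl(\bigcup_{c_1 \in V_1 - V} e^{\mathcal{F},k}_{(\mathcal{G}_1,a_1,c_1)}\Bigr) \Big/ \Bigl(\bigcap_{c_1 \in V} \bigcup_{\substack{e \in \mathcal{F}_k \\ (b_1,c_1) \notin e(\mathcal{G}_1)}} e\Bigr)$$

and

$$\psi^{\mathcal{F},k+1}_{\mathbf{G}_1,\text{rightres forth}} := \bigcap_{V \subseteq V_1} \Bigl(\bigcap_{c_1 \in V} \bigcup_{\substack{e \in \mathcal{F}_k \\ (c_1,a_1) \notin e(\mathcal{G}_1)}} e\Bigr) \Big\backslash \Bigl(\bigcup_{c_1 \in V_1 - V} e^{\mathcal{F},k}_{(\mathcal{G}_1,c_1,b_1)}\Bigr).$$

The correctness argument involves no new insights beyond the proofs of the previous representation lemmas. □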



5.3 Characterization theorem 

From the preceding two subsections, in a similar way as Theorem 19, we obtain

Theorem 26. Let $\mathcal{F}$ be a fragment of the calculus of relations not containing the difference operation. Let $k$ be a natural number and let $\mathbf{G}_1 = (\mathcal{G}_1,a_1,b_1)$ and $\mathbf{G}_2 = (\mathcal{G}_2,a_2,b_2)$ be marked structures. Then, $\mathbf{G}_1 \preceq^{\mathcal{F}}_k \mathbf{G}_2$ if and only if $\mathbf{G}_1 \trianglerighteq_{\mathcal{F}_k} \mathbf{G}_2$.



6 Indistinguishability of finite structures 

The characterizations we have given of when two structures are indistinguishable by expressions of $\mathcal{F}_k$, for some fixed degree $k$ and some fixed fragment $\mathcal{F}$, are valid for arbitrary structures. We may now ask, given two finite structures as input, if it is effectively decidable whether or not they are indistinguishable by expressions of $\mathcal{F}_k$. Since, for fixed $k$, there are only finitely many expressions in $\mathcal{F}_k$ up to equivalence, this problem is obviously decidable in polynomial time. But how can we decide whether two given finite structures are indistinguishable by expressions of $\mathcal{F}$ without a bound on the degree? In this section, using the preceding results, we will show that this problem is also decidable in polynomial time.

For simplicity of presentation, we will work with the specific fragment $\mathcal{C}(\overline{\pi})$ already used as an example fragment in Section 3.1. The generalization of the results in this section to the other fragments is left as an exercise to the reader.

Recall from Lemma 3 that expressions of $\mathcal{C}(\overline{\pi})$ of degree at most $k$ can only return pairs that belong to $\mathit{paths}^{\mathcal{C}(\overline{\pi})}_k$. Also recall from the discussion preceding

that lemma that paths^ is the same as paths^ and returns all pairs (a, b) of 
nodes such that $b$ is reachable from $a$ by a path of length at most $2^k$. In the following we will abbreviate $\mathit{paths}^{\mathcal{C}(\overline{\pi})}_k$ simply as $\mathit{paths}_k$.
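The description of $\mathit{paths}_k$ as the pairs connected by a path of length at most $2^k$ suggests computing it by repeated squaring. The sketch below makes two assumptions that are not stated in this passage: a path follows edges in their given direction, and `edges` is the union of all relations of the structure. The function name `paths_k` is hypothetical.

```python
def paths_k(nodes, edges, k):
    """Pairs (a, b) such that b is reachable from a in at most 2**k steps."""
    # Length at most 1: the identity together with the edges themselves.
    reach = {(v, v) for v in nodes} | set(edges)
    # Each squaring doubles the length bound, because the identity is included.
    for _ in range(k):
        reach = {(a, d) for (a, b) in reach for (c, d) in reach if b == c}
    return reach

# A directed chain 0 -> 1 -> 2 -> 3 -> 4.
nodes = range(5)
edges = {(0, 1), (1, 2), (2, 3), (3, 4)}

assert (0, 2) in paths_k(nodes, edges, 1)      # length 2 <= 2**1
assert (0, 4) not in paths_k(nodes, edges, 1)  # length 4 >  2**1
assert (0, 4) in paths_k(nodes, edges, 2)      # length 4 <= 2**2
```

For a structure with $n$ nodes the result stabilizes once $2^k \geq n - 1$, since a shortest path visits each node at most once.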

Recall the notion of similarity up to depth k from Definition [7j We can 
define this notion equivalently through the following notion of simulation. 

Definition 27. Let $\mathcal{G}_1$ and $\mathcal{G}_2$ be two structures with node
sets $V_1$ and $V_2$ respectively. Let $k$ be a natural number. Let
$Z = (Z_0, Z_1, \dots, Z_k)$ and $W = (W_0, W_1, \dots, W_k)$ be tuples of
relations with $Z_i \subseteq V_1^2 \times V_2^2$ and
$W_i \subseteq V_2^2 \times V_1^2$ for $i = 0, \dots, k$. The pair $(Z, W)$ is
called a simulation up to depth $k$ from $\mathcal{G}_1$ to $\mathcal{G}_2$ if the
following conditions are satisfied:

Atoms. Assume $(a_1, b_1, a_2, b_2) \in Z_i$. If $a_1 = b_1$, then $a_2 = b_2$;
furthermore, for each $R \in \Lambda$, if $(a_1, b_1) \in R^{\mathcal{G}_1}$, then
$(a_2, b_2) \in R^{\mathcal{G}_2}$.

Composition Forth. Assume $(a_1, b_1, a_2, b_2) \in Z_i$ with $i > 0$. Then for
every $c_1 \in V_1$ with $(a_1, c_1)$ and $(c_1, b_1)$ in
$\mathrm{paths}_{i-1}(\mathcal{G}_1)$, there exists $c_2 \in V_2$ with
$(a_2, c_2)$ and $(c_2, b_2)$ in $\mathrm{paths}_{i-1}(\mathcal{G}_2)$, such that
both $(a_1, c_1, a_2, c_2) \in Z_{i-1}$ and $(c_1, b_1, c_2, b_2) \in Z_{i-1}$.

Coprojection Forth. Assume $(a_1, b_1, a_2, b_2) \in Z_i$ with $i > 0$ and
$a_1 = b_1$ (whence $a_2 = b_2$ by the Atoms condition). Then for every
$c_2 \in V_2$ with $(a_2, c_2)$ in $\mathrm{paths}_{i-1}(\mathcal{G}_2)$, there
exists $c_1 \in V_1$ with $(a_1, c_1)$ in $\mathrm{paths}_{i-1}(\mathcal{G}_1)$
such that $(a_2, c_2, a_1, c_1) \in W_{i-1}$. Furthermore, for every
$c_2 \in V_2$ with $(c_2, a_2)$ in $\mathrm{paths}_{i-1}(\mathcal{G}_2)$, there
exists $c_1 \in V_1$ with $(c_1, a_1)$ in $\mathrm{paths}_{i-1}(\mathcal{G}_1)$
such that $(c_2, a_2, c_1, a_1) \in W_{i-1}$.

Reverse conditions. The above three conditions repeated, but with
$\mathcal{G}_1$ and $\mathcal{G}_2$, and $Z$ and $W$, exchanged:

Reverse Atoms. Assume $(a_2, b_2, a_1, b_1) \in W_i$. If $a_2 = b_2$, then
$a_1 = b_1$; furthermore, for each $R \in \Lambda$, if
$(a_2, b_2) \in R^{\mathcal{G}_2}$, then $(a_1, b_1) \in R^{\mathcal{G}_1}$.

Reverse Composition Forth. Assume $(a_2, b_2, a_1, b_1) \in W_i$ with $i > 0$.
Then for every $c_2 \in V_2$ with $(a_2, c_2)$ and $(c_2, b_2)$ in
$\mathrm{paths}_{i-1}(\mathcal{G}_2)$, there exists $c_1 \in V_1$ with
$(a_1, c_1)$ and $(c_1, b_1)$ in $\mathrm{paths}_{i-1}(\mathcal{G}_1)$, such that
both $(a_2, c_2, a_1, c_1) \in W_{i-1}$ and $(c_2, b_2, c_1, b_1) \in W_{i-1}$.

Reverse Coprojection Forth. Assume $(a_2, b_2, a_1, b_1) \in W_i$ with $i > 0$
and $a_2 = b_2$ (whence $a_1 = b_1$ by the Reverse Atoms condition). Then for
every $c_1 \in V_1$ with $(a_1, c_1)$ in $\mathrm{paths}_{i-1}(\mathcal{G}_1)$,
there exists $c_2 \in V_2$ with $(a_2, c_2)$ in
$\mathrm{paths}_{i-1}(\mathcal{G}_2)$ such that
$(a_1, c_1, a_2, c_2) \in Z_{i-1}$. Furthermore, for every $c_1 \in V_1$ with
$(c_1, a_1)$ in $\mathrm{paths}_{i-1}(\mathcal{G}_1)$, there exists
$c_2 \in V_2$ with $(c_2, a_2)$ in $\mathrm{paths}_{i-1}(\mathcal{G}_2)$ such that
$(c_1, a_1, c_2, a_2) \in Z_{i-1}$.
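At depth 0 only the Atoms and Reverse Atoms conditions apply, since the other conditions require $i > 0$. The following minimal sketch collects all 4-tuples that pass these two checks. It assumes, purely for illustration, that each structure is given as a set of nodes plus a dictionary mapping relation names to sets of pairs; the names `atoms_ok` and `depth_zero` are hypothetical.

```python
from itertools import product

def atoms_ok(rels_a, rels_b, x, y, u, v):
    """Atoms condition for the tuple (x, y, u, v), read from structure A to structure B."""
    if x == y and u != v:
        return False
    # Every relation holding on (x, y) in A must also hold on (u, v) in B.
    return all((u, v) in rels_b.get(name, set())
               for name, pairs in rels_a.items() if (x, y) in pairs)

def depth_zero(nodes1, rels1, nodes2, rels2):
    """All tuples passing Atoms (for Z_0) and Reverse Atoms (for W_0)."""
    z0 = {(a1, b1, a2, b2)
          for a1, b1, a2, b2 in product(nodes1, nodes1, nodes2, nodes2)
          if atoms_ok(rels1, rels2, a1, b1, a2, b2)}
    w0 = {(a2, b2, a1, b1)
          for a2, b2, a1, b1 in product(nodes2, nodes2, nodes1, nodes1)
          if atoms_ok(rels2, rels1, a2, b2, a1, b1)}
    return z0, w0
```

Reverse Atoms is obtained from the same check by swapping the roles of the two structures, mirroring the symmetry in the definition.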

The above definition of simulation corresponds to the notion of similarity we
already have. We will prove this in Proposition 30 below. For that proof, we
first need the following definition and two lemmas.

For two tuples of relations $Z$ and $Z'$ of the same length, we define their
union $Z'' = Z \cup Z'$ in the obvious pointwise manner by
$Z''_i := Z_i \cup Z'_i$ for each $i$. Similarly, for two simulations $(Z, W)$
and $(Z', W')$ up to the same depth, we can define their union as
$(Z \cup Z', W \cup W')$. The proof of the following lemma is straightforward.

Lemma 28. If $(Z, W)$ and $(Z', W')$ are simulations from $\mathcal{G}_1$ to
$\mathcal{G}_2$ up to depth $k$, then their union $(Z \cup Z', W \cup W')$ is
also a simulation from $\mathcal{G}_1$ to $\mathcal{G}_2$ up to depth $k$.
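The pointwise union is simple to state in code. The sketch below assumes, as our own convention, that a tuple of relations is a Python tuple of sets and that a simulation is a pair of such tuples; it only builds the union and does not itself verify the simulation conditions.

```python
def union_tuples(z, z_prime):
    """Pointwise union of two tuples of relations of the same length."""
    if len(z) != len(z_prime):
        raise ValueError("tuples of relations must have the same length")
    return tuple(zi | zi_prime for zi, zi_prime in zip(z, z_prime))

def union_simulations(sim, sim_prime):
    """Union of two pairs (Z, W) and (Z', W') up to the same depth."""
    (z, w), (z_prime, w_prime) = sim, sim_prime
    return union_tuples(z, z_prime), union_tuples(w, w_prime)
```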

Since the three reversed conditions in the definition of simulation are
completely symmetric to the first three conditions, we also have the following:

Lemma 29. $(Z, W)$ is a simulation from $\mathcal{G}_1$ to $\mathcal{G}_2$ if and
only if $(W, Z)$ is a simulation from $\mathcal{G}_2$ to $\mathcal{G}_1$.

We now state: 

Proposition 30. $(\mathcal{G}_1, a_1, b_1) \preceq_k (\mathcal{G}_2, a_2, b_2)$ if
and only if there exists a simulation $(Z, W)$ up to depth $k$ from
$\mathcal{G}_1$ to $\mathcal{G}_2$ such that $(a_1, b_1, a_2, b_2) \in Z_k$.

Proof. The if-direction is immediately verified by induction on $k$. The
only-if direction can also be proven by induction on $k$. The case $k = 0$ is
clear. Now assume
$(\mathcal{G}_1, a_1, b_1) \preceq_{k+1} (\mathcal{G}_2, a_2, b_2)$. Then for each
$c_1 \in V_1$ with $(a_1, c_1)$ and $(c_1, b_1)$ in
$\mathrm{paths}_k(\mathcal{G}_1)$, there exists $c_2 \in V_2$ with $(a_2, c_2)$
and $(c_2, b_2)$ in $\mathrm{paths}_k(\mathcal{G}_2)$, such that both
$(\mathcal{G}_1, a_1, c_1) \preceq_k (\mathcal{G}_2, a_2, c_2)$ and
$(\mathcal{G}_1, c_1, b_1) \preceq_k (\mathcal{G}_2, c_2, b_2)$. By induction,
this is equivalent to the existence of $c_2 \in V_2$ and of simulations up to
depth $k$, $(Z^{c_1}_{\mathrm{forth}}, W^{c_1}_{\mathrm{forth}})$ and
$(Z'^{c_1}_{\mathrm{forth}}, W'^{c_1}_{\mathrm{forth}})$, from $\mathcal{G}_1$ to
$\mathcal{G}_2$ such that

$(a_1, c_1, a_2, c_2) \in (Z^{c_1}_{\mathrm{forth}})_k$ and
$(c_1, b_1, c_2, b_2) \in (Z'^{c_1}_{\mathrm{forth}})_k$.

Furthermore, if $a_1 = b_1$ (and $a_2 = b_2$), then for every $c_2 \in V_2$ with
$(a_2, c_2) \in \mathrm{paths}_k(\mathcal{G}_2)$, there exists $c_1 \in V_1$ with
$(a_1, c_1) \in \mathrm{paths}_k(\mathcal{G}_1)$ such that
$(\mathcal{G}_2, a_2, c_2) \preceq_k (\mathcal{G}_1, a_1, c_1)$. By induction and
Lemma 29, this is equivalent to the existence of $c_1 \in V_1$ and a simulation
$(Z^{c_2}_{\mathrm{coproj}_2}, W^{c_2}_{\mathrm{coproj}_2})$ up to depth $k$ from
$\mathcal{G}_1$ to $\mathcal{G}_2$ such that
$(a_2, c_2, a_1, c_1) \in (W^{c_2}_{\mathrm{coproj}_2})_k$. Similarly, by the
second part of the Coprojection Forth condition, for every $c_2 \in V_2$ with
$(c_2, a_2) \in \mathrm{paths}_k(\mathcal{G}_2)$ there exists $c_1 \in V_1$ and a
simulation $(Z^{c_2}_{\mathrm{coproj}_1}, W^{c_2}_{\mathrm{coproj}_1})$ up to
depth $k$ from $\mathcal{G}_1$ to $\mathcal{G}_2$ such that
$(c_2, a_2, c_1, a_1) \in (W^{c_2}_{\mathrm{coproj}_1})_k$.

All the above simulations can be extended up to depth $k + 1$ by setting the
$(k+1)$st component to empty. The desired simulation is now obtained by taking
the union of all these simulations (using Lemma 28), to which we add the single
tuple $(a_1, b_1, a_2, b_2)$ in the $(k+1)$st component. □

For two tuples of relations $Z$ and $Z'$ of the same length, we say that
$Z \subseteq Z'$ if the inclusion holds pointwise, i.e., $Z_i \subseteq Z'_i$ for
every $i$. Similarly, for two simulations $(Z, W)$ and $(Z', W')$ up to the same
depth, we define $(Z, W) \subseteq (Z', W')$ if $Z \subseteq Z'$ and
$W \subseteq W'$.

We conclude: 

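Proposition 31. Given two finite structures $\mathcal{G}_1$ and $\mathcal{G}_2$
and a natural number $k$, there is a unique simulation from $\mathcal{G}_1$ to
$\mathcal{G}_2$ up to depth $k$, denoted by
$\mathrm{Sim}(\mathcal{G}_1, \mathcal{G}_2, k)$, that is maximal in that every
other simulation from $\mathcal{G}_1$ to $\mathcal{G}_2$ up to depth $k$ is
included in $\mathrm{Sim}(\mathcal{G}_1, \mathcal{G}_2, k)$. Moreover,
$(\mathcal{G}_1, a_1, b_1) \preceq_k (\mathcal{G}_2, a_2, b_2)$ if and only if
$(a_1, b_1, a_2, b_2) \in Z_k$, where
$\mathrm{Sim}(\mathcal{G}_1, \mathcal{G}_2, k) = (Z, W)$.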

Proof. The maximal simulation equals the union of all simulations from
$\mathcal{G}_1$ to $\mathcal{G}_2$ up to depth $k$. There are only finitely many
such simulations since the structures are finite; their union is a simulation
by Lemma 28. The proposition then follows from the previous proposition. □

For later use we note the following property of the maximal simulation, which
follows immediately from the above proposition, Proposition 8, and Lemma 29.

Lemma 32. The maximal simulation
$(Z, W) = \mathrm{Sim}(\mathcal{G}_1, \mathcal{G}_2, k)$ up to depth $k$ is
monotonically decreasing, i.e., it satisfies $Z_i \supseteq Z_{i+1}$ and
$W_i \supseteq W_{i+1}$ for $0 \le i < k$.

We next introduce an operator by which a simulation up to depth k can be 
refined up to depth k + 1. 

Definition 33. Let $\mathcal{G}_1$ and $\mathcal{G}_2$ be two structures with node
sets $V_1$ and $V_2$ respectively, and let $k$ be a natural number. Let
$Z \subseteq V_1^2 \times V_2^2$ and $W \subseteq V_2^2 \times V_1^2$. We define
$\mathrm{Refine}_{k+1}(Z, W)$ to be the pair $(Z', W')$ where

• $Z'$ is the set of all tuples $(a_1, b_1, a_2, b_2) \in Z$ satisfying the
following two conditions:

Composition Forth. For every $c_1 \in V_1$ with $(a_1, c_1)$ and $(c_1, b_1)$ in
$\mathrm{paths}_k(\mathcal{G}_1)$, there exists $c_2 \in V_2$ with $(a_2, c_2)$
and $(c_2, b_2)$ in $\mathrm{paths}_k(\mathcal{G}_2)$, such that both
$(a_1, c_1, a_2, c_2) \in Z$ and $(c_1, b_1, c_2, b_2) \in Z$.

Coprojection Forth. Assuming $a_1 = b_1$, for every $c_2 \in V_2$ with
$(a_2, c_2)$ in $\mathrm{paths}_k(\mathcal{G}_2)$, there exists $c_1 \in V_1$
with $(a_1, c_1)$ in $\mathrm{paths}_k(\mathcal{G}_1)$ such that
$(a_2, c_2, a_1, c_1) \in W$. Furthermore, for every $c_2 \in V_2$ with
$(c_2, a_2)$ in $\mathrm{paths}_k(\mathcal{G}_2)$, there exists $c_1 \in V_1$
with $(c_1, a_1)$ in $\mathrm{paths}_k(\mathcal{G}_1)$ such that
$(c_2, a_2, c_1, a_1) \in W$.

• $W'$ is the set of all tuples $(a_2, b_2, a_1, b_1) \in W$ satisfying the
following two conditions:

Reverse Composition Forth. For every $c_2 \in V_2$ with $(a_2, c_2)$ and
$(c_2, b_2)$ in $\mathrm{paths}_k(\mathcal{G}_2)$, there exists $c_1 \in V_1$
with $(a_1, c_1)$ and $(c_1, b_1)$ in $\mathrm{paths}_k(\mathcal{G}_1)$, such
that both $(a_2, c_2, a_1, c_1) \in W$ and $(c_2, b_2, c_1, b_1) \in W$.

Reverse Coprojection Forth. Assuming $a_2 = b_2$, for every $c_1 \in V_1$ with
$(a_1, c_1)$ in $\mathrm{paths}_k(\mathcal{G}_1)$, there exists $c_2 \in V_2$
with $(a_2, c_2)$ in $\mathrm{paths}_k(\mathcal{G}_2)$ such that
$(a_1, c_1, a_2, c_2) \in Z$. Furthermore, for every $c_1 \in V_1$ with
$(c_1, a_1)$ in $\mathrm{paths}_k(\mathcal{G}_1)$, there exists $c_2 \in V_2$
with $(c_2, a_2)$ in $\mathrm{paths}_k(\mathcal{G}_2)$ such that
$(c_1, a_1, c_2, a_2) \in Z$.
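One application of the Refine operator can be sketched as follows. This is an illustrative reading of the definition, not an implementation taken from the paper: relations are Python sets of 4-tuples, `p1` and `p2` are assumed to be the precomputed relations $\mathrm{paths}_k$ of the two structures, and the helper name `half` is our own. Because the conditions for $W'$ are those for $Z'$ with the two structures exchanged, a single helper serves both.

```python
def half(rel, other, va, vb, pa, pb):
    """Keep the tuples (x, y, u, v) of rel that pass both Forth conditions.

    (x, y) lives in the structure with nodes va and path relation pa,
    (u, v) in the structure with nodes vb and path relation pb.
    """
    def ok(x, y, u, v):
        # Composition Forth: every midpoint c of (x, y) needs a matching d.
        for c in va:
            if (x, c) in pa and (c, y) in pa and not any(
                    (u, d) in pb and (d, v) in pb
                    and (x, c, u, d) in rel and (c, y, d, v) in rel
                    for d in vb):
                return False
        # Coprojection Forth: only applies when x == y.
        if x == y:
            for d in vb:
                # Outgoing paths from u must be answered from x, via `other`.
                if (u, d) in pb and not any(
                        (x, c) in pa and (u, d, x, c) in other for c in va):
                    return False
                # Incoming paths into u must be answered into x, via `other`.
                if (d, u) in pb and not any(
                        (c, x) in pa and (d, u, c, x) in other for c in va):
                    return False
        return True
    return {t for t in rel if ok(*t)}

def refine(z, w, v1, v2, p1, p2):
    """Return (Z', W') for the given Z, W, node sets and path relations."""
    return half(z, w, v1, v2, p1, p2), half(w, z, v2, v1, p2, p1)
```

Each tuple is checked with a bounded number of nested loops over the node sets, so one application is polynomial in the sizes of the two structures.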

The two main properties of the Refine operator are: 

Lemma 34. Let $S = (Z, W)$ be a simulation from $\mathcal{G}_1$ to
$\mathcal{G}_2$ up to depth $k$, let
$(Z_{k+1}, W_{k+1}) = \mathrm{Refine}_{k+1}(Z_k, W_k)$, and let
$T = ((Z, Z_{k+1}), (W, W_{k+1}))$. Then

1. $T$ is a simulation from $\mathcal{G}_1$ to $\mathcal{G}_2$ up to depth
$k + 1$.

2. Assume $\mathcal{G}_1$ and $\mathcal{G}_2$ are finite. If $S$ is the maximal
simulation from $\mathcal{G}_1$ to $\mathcal{G}_2$ up to depth $k$, then $T$ is
the maximal simulation from $\mathcal{G}_1$ to $\mathcal{G}_2$ up to depth
$k + 1$.

Proof. The proof of the first property is immediate. To show maximality,
consider the maximal simulation $S' = (Z', W')$ from $\mathcal{G}_1$ to
$\mathcal{G}_2$ up to depth $k + 1$. We have to show that $S' \subseteq T$. Since
$S$ is maximal up to depth $k$, we know that
$((Z'_0, \dots, Z'_k), (W'_0, \dots, W'_k))$, which is a simulation up to depth
$k$, is included in $S$. So it remains to show
$(Z'_{k+1}, W'_{k+1}) \subseteq (Z_{k+1}, W_{k+1})$. By Lemma 32 we have
$Z'_{k+1} \subseteq Z'_k \subseteq Z_k$ and
$W'_{k+1} \subseteq W'_k \subseteq W_k$. Then by comparing the definition of
simulation up to depth $k + 1$ with Definition 33, it is clear that
$(Z'_{k+1}, W'_{k+1}) \subseteq (Z_{k+1}, W_{k+1})$, as desired. □

We can finally conclude: 

Theorem 35. The following problem is decidable in polynomial time: given
two finite marked structures $\mathcal{G}_1$ and $\mathcal{G}_2$, decide whether
$\mathcal{G}_1 \equiv_{\mathcal{C}(\overline{\pi})} \mathcal{G}_2$ holds.

Proof. Let $(Z_0, W_0)$ be the maximal simulation from $\mathcal{G}_1$ to
$\mathcal{G}_2$ up to depth 0. So, $Z_0$ consists of all tuples
$(a_1, b_1, a_2, b_2) \in V_1^2 \times V_2^2$ that satisfy the Atoms condition
for $i = 0$, and $W_0$ consists of all tuples
$(a_2, b_2, a_1, b_1) \in V_2^2 \times V_1^2$ that satisfy the Reverse Atoms
condition for $i = 0$. Now for every natural number $k > 0$, define by induction
$(Z_k, W_k) := \mathrm{Refine}_k(Z_{k-1}, W_{k-1})$. By Lemma 34, this way we
obtain the maximal simulation from $\mathcal{G}_1$ to $\mathcal{G}_2$ up to ever
increasing depths. By Proposition 31 and Theorem 11 we have
$(a_1, b_1, a_2, b_2) \in Z_k$ iff

$(\mathcal{G}_1, a_1, b_1) \equiv_{\mathcal{C}(\overline{\pi})_k} (\mathcal{G}_2, a_2, b_2)$.

For
$(\mathcal{G}_1, a_1, b_1) \equiv_{\mathcal{C}(\overline{\pi})} (\mathcal{G}_2, a_2, b_2)$
to hold, we need to verify whether $(a_1, b_1, a_2, b_2) \in Z_k$ for all natural
numbers $k$. Since $\mathcal{G}_1$ and $\mathcal{G}_2$ are finite, and the
sequence is monotonically decreasing (Lemma 32), there exists $\ell$ such that
$Z_k = Z_\ell$ for all $k \ge \ell$. It thus suffices to compute $(Z_k, W_k)$ for
increasing $k$ until no changes occur and check whether
$(a_1, b_1, a_2, b_2) \in Z_\ell$. Denoting the maximum of the cardinalities of
$V_1$ and $V_2$ by $n$, we have $\ell \le 2n^4$, since in the worst case each
iteration decreases one of $Z_k$ or $W_k$ by a single 4-tuple. Each iteration of
the Refine operator is clearly computable in polynomial time. Thus the theorem
is proved. □
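The iteration used in this proof can be sketched as a generic driver loop. The code below is a hedged illustration rather than the authors' procedure: `refine_step` is a hypothetical callable standing in for $\mathrm{Refine}_k$, and the parameter `min_rounds` is a precaution of our own that lets the caller force a number of rounds before the stopping test is trusted (for instance until $\mathrm{paths}_k$ has stopped growing). The function `toy_step` is only a stand-in so that the loop can be run; it is not the Refine operator.

```python
def iterate_refinement(z0, w0, refine_step, min_rounds=0):
    """Apply refine_step(k, Z, W) for k = 1, 2, ... until nothing changes.

    Returns (Z, W, rounds), where rounds counts all rounds performed,
    including the final one that changed nothing.
    """
    z, w = frozenset(z0), frozenset(w0)
    k = 0
    while True:
        k += 1
        new_z, new_w = (frozenset(s) for s in refine_step(k, z, w))
        # Refinement may only remove tuples; every changing round removes at
        # least one, so at most len(z0) + len(w0) rounds can change anything.
        if not (new_z <= z and new_w <= w):
            raise ValueError("refine_step must only remove tuples")
        unchanged = (new_z, new_w) == (z, w)
        z, w = new_z, new_w
        if unchanged and k > min_rounds:
            return z, w, k

def toy_step(k, z, w):
    """Stand-in refinement: drop one tuple per round, first from Z, then from W."""
    if len(z) > 1:
        return z - {max(z)}, w
    if len(w) > 1:
        return z, w - {max(w)}
    return z, w

z, w, rounds = iterate_refinement(
    {(0, 0, 0, 0), (1, 1, 0, 0), (1, 0, 0, 0)},
    {(0, 0, 0, 0), (0, 0, 1, 1)},
    toy_step)
```

The stand-in removes a single tuple per round, which is exactly the worst case behind the bound on $\ell$: the number of rounds that change anything cannot exceed the combined size of the two starting sets.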



7 Concluding remark 

In our work, we have always included the identity relation and the three
operations union, intersection and composition in the logics that we consider.
It is an interesting topic for further research to see what happens if some of
these operators are left out. In some aspects, the problems can change
drastically. For example, consider the logic consisting only of composition and
nothing else. Then indistinguishability of finite structures amounts to the
equivalence problem for finite automata, which is PSPACE-complete [4], as
opposed to the polynomial-time decidability we established for the fragments
considered in this paper.